I introduce a framework in which a variety of probabilistic theories can be defined, including clas-
sical and quantum theories, and many others. From two simple assumptions, a tensor product rule 
for combining separate systems can be derived. Certain features, usually thought of as specifically 
quantum, turn out to be generic in this framework, meaning that they are present in all except 
classical theories. These include the non-unique decomposition of a mixed state into pure states, a 
theorem involving disturbance of a system on measurement (suggesting that the possibility of secure 
key distribution is generic), and a no-cloning theorem. Two particular theories are then investigated 
in detail, for the sake of comparison with the classical and quantum cases. One of these includes 
states that can give rise to arbitrary non-signalling correlations, including the super-quantum corre- 
lations that have become known in the literature as Nonlocal Machines or Popescu-Rohrlich boxes. 
By investigating these correlations in the context of a theory with well-defined dynamics, I hope to 
make further progress with a question raised by Popescu and Rohrlich, which is, why does quan- 
tum theory not allow these strongly nonlocal correlations? The existence of such correlations forces 
much of the dynamics in this theory to be, in a certain sense, classical, with consequences for tele- 
portation, cryptography and computation. I also investigate another theory in which all states are 
local. Finally, I raise the question of what further axiom(s) could be added to the framework in 
order uniquely to identify quantum theory, and hypothesize that quantum theory is optimal for 
computation. 



PACS numbers: 03.67.-a, 03.65.Ta 



I. INTRODUCTION 



The question is periodically raised, what is responsible 
for the power of quantum computation (or cryptography, 
or information processing in general)? At a recent meet- 
ing in Konstanz [1], speakers referred to quantum entan- 
glement; the superposition principle; the exponentially 
growing size of Hilbert space with the number of qubits; 
nonlocality and contextuality; the possibility of continu- 
ous reversible transformations between pure states; and 
the so-called sign problem in Monte Carlo simulations of 
certain types of quantum system 0. It is perhaps un- 
surprising that there are so many different answers. The 
problem is that the results of quantum information the- 
ory are already well understood as consequences of the 
quantum formalism, and it is not clear that simply point- 
ing to aspects of that formalism tells us anything new. 
What we are really looking for is a better understanding 
of the connections between information processing and 
physical principles in general. 

Such an understanding could be gained by studying in- 
formation processing in a broader range of theories than 
classical and quantum, where different physical principles 
may hold. For any theory, whether it applies to Nature or 
not, one can consider the information processing possibil- 
ities of this theory, the differences from those of classical 
or quantum theory, and attempt to trace these possi- 
bilities back to the fundamental features of the theory. 
Some authors have indeed investigated unrealistic theo- 
ries, with a view to understanding the relevant features

To make further progress along these lines, I intro- 
duce an operational framework for probabilistic theories 
in which a broad range of different theories can be de- 
fined. The framework, described in Sections II, III and IV,
is based on that used by Hardy in his derivation of
quantum theory from simple axioms [14]. The basic idea 
is that a state is represented as a vector of probabilities 
of measurement outcomes. Transformations of a system 
must correspond to linear transformations of this vec- 
tor. By including probabilistic, that is normalization- 
decreasing, transformations, a unified account of trans- 
formations and measurements can be given. Rather than 
employ any of Hardy's axioms, I introduce two assump- 
tions that concern how separate systems combine to form 
a joint system. The first is that operations on the sepa- 
rate systems commute (this implies a no-signalling prin- 
ciple), and the second is that the state of the joint system 
can be completely specified by joint probabilities for local 
measurements. From these assumptions a tensor prod- 
uct rule can be derived. This removes at least some of 
the mystery from the quantum tensor product rule and 
generalizes a derivation by Fuchs.

The resulting framework includes classical probabilis- 
tic theories, quantum theory, and many other theories 
besides. The first thing one notices is that certain phe- 
nomena, usually thought of as specifically quantum, are 
in fact generic. This means that they either appear in 
all theories, or they appear in all theories except clas- 
sical theories, which emerge as a very special case. As 
shown in Section V, these phenomena include the non-
unique decomposition of a mixed state into pure states, a 
theorem concerning the disturbance of a system on mea-
surement, and the no-cloning theorem. (These observa- 
tions are complementary to those of Ref. [l^ , where it is 
noted that similar properties hold in nonlocal but non- 
signalling theories.) 

In addition to looking at generic properties of theo- 
ries, it is useful to analyze at least one or two novel 
theories in detail. These then provide well-understood 
examples that can be contrasted with the classical and 
quantum cases. Thus the rest of this work is devoted 
to an analysis of two theories that admit a particularly 
natural definition. The first of these allows arbitrary cor- 
relations between measurements on separated systems, 
as long as they are non-signalling. I call it Generalized 
Non-Signalling Theory (GNST). The correlations allowed 
by this theory can be more nonlocal than quantum the- 
ory allows, and include the super-quantum correlations 
that have come to be known variously in the literature 
as Popescu-Rohrlich (PR) boxes, or Nonlocal Machines. Popescu and
Rohrlich raised the question of why quantum theory does
not allow these correlations. An investigation of a com- 
plete theory, with dynamics, that does include the cor- 
relations may help to answer this question. The second 
theory allows the same states of single systems as GNST, 
but does not allow any violation of Bell inequalities. For 
this reason it is called Generalized Local Theory (GLT). 

One of the interesting things about GNST is that there 
are many direct analogues of quantum phenomena (in 
addition to the generic phenomena mentioned above). 
These include entanglement, nonlocality, a form of con- 
textuality, and the Einstein-Podolski-Rosen (EPR) para- 
dox. (Interestingly, a quite different toy theory intro- 
duced by Spekkens displays many of these phenomena 
too [IT'].) However, there are also differences with quan-
tum theory. A central insight of this work is that there 
is a trade-off between the allowed states of a theory and 
the allowed dynamics. This follows from the simple fact 
that dynamics has to act in such a way that allowed 
states are taken to allowed states. In the case of GNST, 
the fact that all non-signalling correlations are possible 
means that the dynamics is highly restricted. In fact, I 
show in Section VI that the dynamics of single systems in
GNST is essentially classical, corresponding to no more 
than relabellings of measurements and outcomes. This 
result is extended to transformations and measurements 
on simple kinds of bipartite systems (more complicated 
cases are still open). GLT is in some sense intermediate, 
with transformations on single systems similarly simple, 
but with transformations on bipartite systems including 
other possibilities. 

These conclusions about dynamics have consequences 
for information processing, discussed in Section VII. For
example, there is no teleportation in GNST, despite the 
existence of highly nonlocal states that might have been 
thought to facilitate a task like teleportation. Key dis- 
tribution is possible in GNST and 1-2 oblivious transfer 
in both GNST and GLT. Other cryptographic possibilities,
such as key distribution in GLT, or bit commitment
in either theory, are open questions. A natural circuit- 
type model of computation can be defined for any theory 
in the framework. The states and dynamics together in 
GLT are sufficiently restricted that computation can be 
simulated efficiently by a classical computer. The theo- 
rems concerning dynamics in GNST give evidence that 
computation in this theory can also be simulated effi- 
ciently by a classical computer (despite the existence of 
super-entangled states). The fact that quantum theory, 
unlike GNST and GLT, achieves such a harmonious bal- 
ance of states and dynamics leads to the following hy- 
pothesis that I leave open: a quantum computer can sim- 
ulate computation in any theory in the framework with at 
most polynomial overhead. 

Finally, two motivations are not directly connected 
with information processing. On the face of it, most of 
the theories that can be written down in the framework 
described suffer from similar interpretational problems as 
quantum theory. For example, are pure states in one of 
these theories best regarded as complete descriptions of 
individual reality, as describing only ensembles, or as de- 
scriptions of agents' degrees of belief? Although I do not 
pursue them in this paper, consideration of these questions in
a broader framework may shed new light on the quan- 
tum theoretical problems. The other motivation is to 
stimulate research into finding ways of deriving quantum 
theory from physical principles (instead of laying down 
a list of mathematical axioms, as per the standard text- 
book approach). What principles could be used to rule 
out the other theories described and leave only quantum 
theory? One reason for deriving quantum theory from 
physical principles is that by modifying one or another 
of the principles, we may discover new ways of going be- 
yond quantum theory. 



II. A FRAMEWORK FOR PROBABILISTIC 
THEORIES 

This section describes in some detail a general opera- 
tional framework in which probabilistic theories can be 
written down. All theories in this framework share the 
following features with classical and quantum theory. 

1. Local operations on distinct subsystems commute. 
In the case of a bipartite system AB, for example, 
this means that if an operation is performed on sys- 
tem A alone, and an operation on system B alone, 
it does not matter what order the operations were 
performed in. 



2. The global state of a composite system is deter- 
mined by correlations between local measurements. 

A. States and Operations

Consider a laboratory containing preparation devices 
and operation devices. Preparation devices prepare a 
system in a given state and operation devices act on a 
system, in general changing its state. When an opera- 
tion device is used, there may be several different out- 
comes, each occurring with some probability. Each out- 
come is identified by a different macroscopic event (for 
example, a different light being illuminated on the de- 
vice, or a different position of a pointer). Thus opera- 
tion devices serve to perform both transformations and 
measurements. Given the state of a system, it should 
be possible to calculate the probabilities of measurement 
outcomes for any measurement. Conversely, if the prob- 
abilities of measurement outcomes for any measurement 
are known, then the state is known. 

Suppose that systems come in different types, where in 
quantum theory, for example, the type of system corre- 
sponds to the dimension of its Hilbert space. For each 
type of system, there is some finite set $\mathcal{F}$ of measurements,
each with a finite number of outcomes, such that
the state of the system can be completely specified by
listing the probabilities for these outcomes. For example,
in quantum theory, the state of a spin-1/2 particle
can be specified by giving the probabilities of obtaining
spin-up on measuring in the x, y and z directions.
Call the measurements in $\mathcal{F}$ fiducial measurements and
$\mathcal{F}$ the fiducial set. In general, there will be other mea-
surements that can be performed on a system that are 
not contained in the fiducial set (a measurement of spin 
in some direction at 45° to the z-axis, say). The probabil- 
ities of outcomes of these measurements can nevertheless 
be determined from the state. We ignore the possibil- 
ity of states requiring an infinite number of probabilities 
to be specified (despite the fact that quantum theory 
includes infinite dimensional systems and classical prob- 
ability theory infinite sample sets). The set of fiducial 
measurements need not be unique. In general it will be 
possible to find a different set (perhaps involving a dif- 
ferent number of measurements with different numbers 
of outcomes) that also suffices to specify the state. 

This is essentially the framework described by Hardy 
[3], who introduced the term fiducial for the state-
defining measurements. (See also [Is, 27, 28;, [1^, where 
the idea of representing a state via probabilities for mea- 
surement outcomes is also explored.) Unlike Hardy, we 
shall assume for convenience that the degrees of freedom 
expressed in the state are internal degrees of freedom, 
and that all measurements are measurements of internal 
degrees of freedom. With respect to spacetime degrees 
of freedom, systems behave classically, having a definite 
position and velocity at all times. This seems the most 
natural position to take given that we are most interested 
in the information processing properties of the different 
theories considered. However, it would be interesting to 
extend this work, and to consider what Nature would be 
like if all degrees of freedom, including those of spacetime,
were described by a theory like one of the ones presented
here (but extended to allow for infinite-outcome
measurements).

The above is summarized by 

Assumption 1 The state of a single system can be completely
specified by listing the probabilities for the outcomes
of some subset $\mathcal{F}$ of all possible measurements.
These are the fiducial measurements. These probabilities
can be arranged in a vector $P$, where the entry
$P(a = i \mid X = j)$ is the probability of getting outcome $i$
when fiducial measurement $j \in \mathcal{F}$ is performed on the
system.



Normalization of the state would require that 

$$\sum_i P(a = i \mid X = j) = 1 \quad \forall j, \qquad (2)$$



where the sum ranges over all the values i that the out- 
come can take for a particular measurement. It is conve- 
nient also to give a meaning to unnormalized states (just 
as in quantum theory it is sometimes convenient to write 
down unnormalized density matrices). Suppose that a 
system is prepared in some (normalized) state and an 
operation performed with an outcome i that is obtained 
with probability less than 1. There is an unnormalized 
state associated with i, each entry of which is the joint 
probability of getting i followed by a particular outcome 
for a subsequent fiducial measurement. This implies that 
unnormalized states satisfy 



$$\sum_i P(a = i \mid X = j) = \sum_i P(a = i \mid X = j') = c \quad \forall j, j', \qquad (3)$$

with $0 \le c \le 1$. In the case described, $c$ is the probability
of the outcome i. This idea generalizes to chains of opera- 
tions, thus operations should be defined on unnormalized 
states as well as on normalized ones. Define 



$$|P| \equiv \sum_i P(a = i \mid X = j), \qquad (4)$$



where the right hand side is independent of the choice of 
j. The notation |P| is used throughout and should not 
be confused with more usual definitions of the norm of a 
vector. 
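
The vector representation and the quantity |P| can be made concrete with a short Python sketch. It is an illustration only, not part of the formalism: the system with two fiducial measurements of two outcomes each, the nested-list layout `state[j][i]`, and the helper name `norm` are all assumptions chosen for the example.

```python
from fractions import Fraction


def norm(state):
    """Return |P| as in Eq. (4).

    state[j][i] holds P(a = i | X = j). Eq. (3) requires the sum over
    outcomes i to be the same constant c for every fiducial measurement j.
    """
    totals = {sum(block) for block in state}
    if len(totals) != 1:
        raise ValueError("outcome sums differ between fiducial measurements")
    return totals.pop()


# Hypothetical system: two fiducial measurements (X = 0, 1), two outcomes each.
normalized = [[Fraction(1), Fraction(0)],
              [Fraction(1, 2), Fraction(1, 2)]]

# Unnormalized state associated with an earlier outcome of probability 1/4:
# every entry becomes a joint probability, so the whole vector is rescaled.
unnormalized = [[Fraction(1, 4) * p for p in block] for block in normalized]

print(norm(normalized), norm(unnormalized))
```

A vector whose blocks sum to different values is rejected, which mirrors the statement that the right hand side of Eq. (4) is independent of the choice of j.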

Suppose that for each type of single system, the fidu- 
cial measurements are fixed. A particular theory will 
specify, for each type of system, a set of allowed vectors
P. These correspond to physically possible states of a 
system, i.e., states that can actually be prepared using 
one of the preparation devices. There is no reason to 
suppose that all vectors P that can be written down can 
actually be prepared. For example, in quantum theory, 
one cannot prepare a system that will with certainty re- 
turn the outcome spin-up for spin measurements in both 
the z- and x-directions. Call the set of allowed states S 
(where there is a different S for each type of system but 
we suppress this dependence). 

Assumption 2 For each type of system, the set of allowed
normalized states is closed and convex. The complete
set of states S is the convex hull of the set of allowed
normalized states and $\mathbf{0}$.

Here $\mathbf{0}$ is the vector with all entries 0. The idea behind this as-
sumption is that it is always possible to toss a biased coin 
and subsequently to be interested only in the joint prob- 
abilities of getting given measurement outcomes along 
with heads. In this way one can 'prepare' unnormalized 
states. If heads occurs with probability zero, the state
$\mathbf{0}$ is prepared. Convexity of S corresponds to the assumption
that if it is possible to prepare states $P_1$ and $P_2$, then
it is also possible to prepare any probabilistic mixture of
the two states. One may toss a coin, prepare either $P_1$ or
$P_2$ depending on the outcome, and then forget the outcome.^1
Extreme points of S apart from $\mathbf{0}$ are pure states.
States that are neither pure nor $\mathbf{0}$ are mixed. Mixed states
can be written as a convex sum of pure states and 0, but 
this sum need not be unique. 
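
The non-uniqueness of such decompositions can be seen in a small Python sketch. It assumes a hypothetical system with two fiducial measurements of two outcomes each, in which every deterministic assignment of outcomes is an allowed pure state; the flat vector layout and the helper name `mix` are choices made for the example, not notation from the text.

```python
from fractions import Fraction


def mix(weights, states):
    """Convex combination sum_k q_k P_k of flat probability vectors."""
    assert sum(weights) == 1 and all(q >= 0 for q in weights)
    return [sum(q * s[n] for q, s in zip(weights, states))
            for n in range(len(states[0]))]


# Layout: (P(0|X=0), P(1|X=0), P(0|X=1), P(1|X=1)).
# Assumed pure states: each fiducial measurement has a definite outcome.
P00 = [1, 0, 1, 0]
P01 = [1, 0, 0, 1]
P10 = [0, 1, 1, 0]
P11 = [0, 1, 0, 1]

half = Fraction(1, 2)
mix_a = mix([half, half], [P00, P11])  # one preparation procedure
mix_b = mix([half, half], [P01, P10])  # a different preparation procedure

print(mix_a == mix_b)
```

Both mixtures give the same vector of outcome probabilities, so no measurement can distinguish the two preparations even though they use disjoint sets of pure states.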

Notice from Eq. (3) that S lies in a subspace of the
complete vector space. In general, we allow for the possibility
that P is an over-complete description of the state
of a system. Thus there may be other linear constraints
that apply apart from Eq. (3), implying that S lies in a
smaller subspace still.

When an operation is performed, each outcome is as- 
sociated with a transformation of the state of the system, 
i.e., with a map from states to states: 



$$P \to P' = f(P). \qquad (5)$$



Some operations have only one outcome and the corre- 
sponding transformation preserves normalization of the 
state (in quantum theory, these are the trace-preserving 
completely positive maps). If an outcome occurs with 
probability < 1, then it is associated with a transfor- 
mation that decreases the normalization of the state (in 
quantum theory, these are trace-decreasing completely 
positive maps). In the most general case, one could 
consider operations that change the system into a sys- 
tem of a different type (just as in quantum theory one
sometimes considers completely positive maps between
Hilbert spaces of different dimension). In this work I assume
that operations do not change the type of system,
although the appropriate generalization is not usually too
difficult.

Consider a transformation acting on a system that is 
in a mixed state, that is a state P such that 



$$P = \sum_i q_i P_i, \qquad (6)$$

where the $P_i$ are allowed states and where $0 \le q_i \le 1$
and $\sum_i q_i = 1$. One way of preparing a system in such
a state would be to prepare a system in the state $P_i$
with probability $q_i$ and then to forget the value of $i$. In
this case the transformed $P$ must be the same convex
combination of the transformed $P_i$, that is

$$f(P) = \sum_i q_i f(P_i). \qquad (7)$$

It follows from this that the action of $f$ on the set of
allowed states $P$ can be represented as



$$P \to M.P, \qquad (8)$$



where $M$ is a matrix, i.e., $f$ is a linear map. This is
not completely obvious from Eq. (7), since the equation
involves only convex combinations, and furthermore only
applies for those $P_i \in S$. A rigorous proof is given in
Appendix A.

An operation corresponds to a set of matrices $\{M_i\}$.^2
The unnormalized state associated with the ith outcome
is $M_i.P$, and the probability of the ith outcome is

$$\frac{|M_i.P|}{|P|}. \qquad (9)$$



For each type of system, a particular theory will specify a 
set of allowed operations. Denote this set O. An element 
of O is a set of transformations $\{M_i\}$, and must be such
that the following holds. 



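Constraint 1

$$0 \le \frac{|M_i.P|}{|P|} \le 1 \quad \forall i,\ P \in S, \qquad (10)$$

$$\sum_i \frac{|M_i.P|}{|P|} = 1 \quad \forall P \in S, \qquad (11)$$

$$M_i.P \in S \quad \forall i,\ P \in S. \qquad (12)$$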
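
As a hedged illustration of how an operation acts, the following Python sketch builds a two-outcome operation as a pair of matrices and evaluates the outcome probabilities of Eq. (9). The operation itself (perform fiducial measurement X = 0, then re-prepare a fixed state) and the helper names `norm`, `apply` and `outer` are assumptions introduced for the example; whether the resulting states are allowed, as Eq. (12) demands, depends on the set S of the particular theory.

```python
from fractions import Fraction


def norm(p):
    """|P| for a flat vector (P(0|X=0), P(1|X=0), P(0|X=1), P(1|X=1))."""
    return p[0] + p[1]  # by Eq. (3) the X = 1 block gives the same value


def apply(m, p):
    """Matrix-vector product M.P."""
    return [sum(m_rc * p_c for m_rc, p_c in zip(row, p)) for row in m]


def outer(q, r):
    """Matrix with entries q[n] * r[c], so that it maps P to (r.P) q."""
    return [[q_n * r_c for r_c in r] for q_n in q]


# Hypothetical operation: perform fiducial measurement X = 0 and, on
# outcome i, re-prepare the normalized state Q[i].
R = [[1, 0, 0, 0], [0, 1, 0, 0]]   # R[i].P = P(a = i | X = 0)
Q = [[1, 0, 1, 0], [0, 1, 0, 1]]   # states re-prepared after each outcome
operation = [outer(q, r) for q, r in zip(Q, R)]

P = [Fraction(3, 4), Fraction(1, 4), Fraction(1, 2), Fraction(1, 2)]
collapsed = [apply(m, P) for m in operation]            # unnormalized M_i.P
outcome_probs = [norm(c) / norm(P) for c in collapsed]  # Eq. (9)

print(outcome_probs, sum(outcome_probs))
```

Each matrix decreases the normalization of the state by exactly the probability of its outcome, and the probabilities sum to one as Eq. (11) requires.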



^1 The assumption is also stated in such a manner as to rule out
the possibility of an unnormalized state without a corresponding 
normalized state. 



^2 A note on terminology. I shall continue to use the term operation
to refer to the experiment with a number of different outcomes 
corresponding to the set {Mi}, and the term transformation to 
refer to a single, in general normalization-decreasing, Mi. 

A further constraint is that each transformation $M_i$ must
result only in allowed states when it acts on a system
that is part of a larger multi-partite system (see next 
section). The following assumption results in some loss 
of generality but also makes things simpler. 

Assumption 3 For each type of system, there is a set 
T of allowed transformations. A set of transformations 
$\{M_i\}$ is an element of O if and only if $M_i \in T\ \forall i$, and
Eq. (11) is satisfied. The set T includes the transformation
that maps all P to $\mathbf{0}$ and is convex.

With this assumption, once T is given, a separate spec- 
ification of O is not needed. The reasons for convexity 
are similar to those given for Assumption 2.

As mentioned above, the formalism of operations al- 
ready includes measurements. Sometimes one is not in- 
terested in the state after measurement but only in the 
probabilities of the different outcomes. In this case it is 
convenient to associate with an operation {Mi} a set of 
vectors {Ri} such that 

$$R_i.P = |M_i.P| \quad \forall P \in S. \qquad (13)$$

Such a set can always be found. For a normalized P, the 
probability of the ith outcome is then given by Ri.P. It 
does not matter if the vector Ri is not unique - this sim- 
ply means that different vectors can represent the same 
measurement outcome. Denote by $\mathcal{M}$ the set of all sets
$\{R_i\}$ such that Eq. (13) holds for some $\{M_i\} \in O$. $\mathcal{M}$ is
the set of allowed measurements. Denote by $\mathcal{R}$ the set of
allowed measurement vectors, that is, the set of vectors
$R$ such that $R.P = |M.P|\ \forall P \in S$, for some $M \in T$.^3
(Notation: $\mathcal{R}$ should not be confused with $\mathbb{R}$, the set of
real numbers.)
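
One way to see that a suitable vector exists for any transformation is to note that |M.P| is itself a fixed linear function of M.P, so a measurement vector can be read off from the matrix. The Python sketch below illustrates this under stated assumptions: the flat layout of the state vector, the particular matrix `M0`, and the helper names `measurement_vector` and `dot` are hypothetical and chosen only for the example.

```python
def measurement_vector(m, identity):
    """Return R with R.P = identity.(M.P) for every P, i.e. R = identity^T M."""
    return [sum(i_r * row[c] for i_r, row in zip(identity, m))
            for c in range(len(m[0]))]


def dot(r, p):
    """Inner product R.P."""
    return sum(r_n * p_n for r_n, p_n in zip(r, p))


# Flat layout (P(0|X=0), P(1|X=0), P(0|X=1), P(1|X=1)); summing the X = 0
# block gives |P|, so this vector I satisfies I.P = |P|.
I = [1, 1, 0, 0]

# Hypothetical transformation: outcome 0 of X = 0, followed by
# re-preparation of the state (1, 0, 1, 0).
M0 = [[1, 0, 0, 0],
      [0, 0, 0, 0],
      [1, 0, 0, 0],
      [0, 0, 0, 0]]

R0 = measurement_vector(M0, I)
P = [0.75, 0.25, 0.5, 0.5]
M0_P = [dot(row, P) for row in M0]

print(R0, dot(R0, P), M0_P[0] + M0_P[1])
```

The vector obtained this way depends on the choice of `I`, which is one reason a measurement vector need not be unique.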

B. Multi-partite systems 

So far, the framework described is similar to that used 
by Hardy as a starting point for his derivation of quantum 
theory (although I have been more explicit about treat- 
ing transformations and measurements in a unified man- 
ner). Hardy narrows things down with various axioms. 
Rather than adopt any of Hardy's axioms, however, I in- 
troduce a small number of non-trivial assumptions that 
concern how systems combine to make multi-partite sys- 
tems. One reason for this is that most questions of infor- 
mation processing do not make sense without some no- 
tion of systems being composed of separate subsystems. 



^3 Recall that in quantum theory, an effect $E$ is a positive operator
such that $0 \le E \le I$. $R$ vectors are essentially a generalization
of the effects to our framework. In the usual quantum formalism,
an effect can represent a yes/no measurement on a quantum state
$\rho$, with the probability of the yes outcome given by $\mathrm{Tr}(E\rho)$. A
set of effects $E_i$ such that $\sum_i E_i = I$, where $I$ is the identity, is
a positive operator-valued (POV) decomposition of the identity,
and corresponds to a POV measurement.



From these assumptions I derive that systems combine 
according to a tensor product rule. This is of indepen- 
dent interest since it sheds light on where this rule comes 
from in quantum theory. 

From here on, the notion of a type of system is broadened.
Thus multi-partite systems can come in different
types, where a particular type of multi-partite system is
composed of $n_A$ single systems of type A, $n_B$ single systems
of type B, and so on. In all of this section, a system
A or B refers to a system of some specific type, that may 
itself be a composite system. 

Begin with the idea that, given a system A, it is possi- 
ble to identify some operations as operations on system A 
alone and that, in particular, the fiducial measurements 
for system A are operations on system A alone. (Without 
this, one might say that we have no business speaking of 
separate systems in the first place.) 

Assumption 4 Local Operations Commute. Consider a 
joint system composed of systems A and B. Suppose 
that an operation is performed on system A alone with 
outcome $o_A$ and an operation on system B alone with
outcome $o_B$. The final unnormalized state of the joint
system does not depend on the order in which the operations
were performed. In particular, this implies that the
joint probability of getting outcomes $o_A$ and $o_B$ does not
depend on the ordering of the operations. 

This assumption means that operations can be regarded 
as performed simultaneously on systems A and B without 
ambiguity. It also implies 

Corollary 1 The No-Signalling Principle. If an opera- 
tion is performed on system A, it is not possible to get in- 
formation about which operation was performed by mea- 
suring system B. 

The proof of the corollary is straightforward. Suppose 
that an operation is performed on system A first, followed 
by an operation on system B. Whichever operation was 
performed on system A, the marginal probability of outcome
$o_B$ is equal to the probability of $o_B$ in the case that
the operation on system B came first. The probability
of $o_B$ is thus independent of the operation on system A.

Assumption 5 The Global State Assumption. The
global state of a multi-partite system can be completely 
determined by specifying joint probabilities of outcomes 
for fiducial measurements performed simultaneously on 
each subsystem. 

Note that while the global state assumption is satisfied 
in quantum theory and in classical probability theory, it 
need not be satisfied in an arbitrary theory. For example, 
it is not true in the case of quantum theory defined over 
a real Hilbert space [30, 31, 32]. So this assumption has
significant content.^4



^4 Arguably, this is not the case for the assumption that local operations
commute, which may be regarded as part of the definition
of what we mean by an operation being on system A alone. Not
wishing to be dogmatic on this point, I have listed this principle
with the other assumptions. We should distinguish, however,
the implied no-signalling principle from the impossibility of
superluminal signalling, which is a contingent fact that as far
as we know is true in our universe. To see the difference, consider
that in the non-relativistic quantum mechanics of particles,
the no-signalling principle is valid, yet superluminal signalling is
possible. In the present framework, the impossibility of superluminal
signalling would imply an upper bound on the velocity of
systems and that Alice cannot carry out an operation on Bob's
system if she is spacelike separated from it. But I shall not use
such notions, or indeed any notion of spacetime structure.

It follows from these two assumptions that the global
state of a multi-partite system can be written in the form
of a vector of joint probabilities. For example, for a bipartite
system AB, it will look like this:

$$P^{AB} = \begin{pmatrix} P(a = 1, b = 1 \mid X = 1, Y = 1) \\ P(a = 1, b = 2 \mid X = 1, Y = 1) \\ \vdots \end{pmatrix}. \qquad (14)$$

$P(a = i, b = j \mid X = k, Y = l)$ is the joint probability
of getting outcomes $i$ and $j$ when fiducial measurements
$k$ and $l$ are performed on the two subsystems. The no-signalling
principle implies

$$\sum_j P(a = i, b = j \mid X = k, Y = l) = \sum_j P(a = i, b = j \mid X = k, Y = l') \quad \forall i, k, l, l', \qquad (15)$$

$$\sum_i P(a = i, b = j \mid X = k, Y = l) = \sum_i P(a = i, b = j \mid X = k', Y = l) \quad \forall j, k, k', l. \qquad (16)$$

The reduced state for system A (analogous to the reduced
state in quantum theory, or marginal probabilities
in classical probability theory) is given by

$$P^A = \begin{pmatrix} P(a = 1 \mid X = 1) \\ P(a = 2 \mid X = 1) \\ \vdots \end{pmatrix}, \qquad (17)$$

where

$$P(a = i \mid X = j) = \sum_{i'} P(a = i, b = i' \mid X = j, Y = j'). \qquad (18)$$

Here, a and X are the outcome and fiducial measurement 
for the system whose reduced state is defined, and b and 
Y are the outcome and fiducial measurement for the other 
system. The no-signalling conditions of Eqs. (15), (16)
ensure that the sum on the right is independent of the 
choice of j'. 
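
The no-signalling conditions and the reduced state can be checked mechanically for a given table of joint probabilities. The Python sketch below is an illustration under stated assumptions: it uses the standard Popescu-Rohrlich correlations (outcomes and measurement labels taking values 0 and 1) as the example state, and the function names and the deliberately signalling counterexample are hypothetical, introduced only for this purpose.

```python
from fractions import Fraction
from itertools import product


def pr_box(a, b, x, y):
    """P(a, b | X = x, Y = y) for the standard Popescu-Rohrlich correlations."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)


def signalling(a, b, x, y):
    """Counterexample: the outcome on B copies the measurement chosen on A."""
    return Fraction(1) if (a == 0 and b == x) else Fraction(0)


def marginal_a(joint, a, x, y):
    """Sum over the outcome b on system B, for a fixed measurement y on B."""
    return sum(joint(a, b, x, y) for b in (0, 1))


def marginal_b(joint, b, x, y):
    """Sum over the outcome a on system A, for a fixed measurement x on A."""
    return sum(joint(a, b, x, y) for a in (0, 1))


def is_no_signalling(joint):
    """Check that each marginal is independent of the other side's measurement."""
    a_ok = all(marginal_a(joint, a, x, 0) == marginal_a(joint, a, x, 1)
               for a, x in product((0, 1), repeat=2))
    b_ok = all(marginal_b(joint, b, 0, y) == marginal_b(joint, b, 1, y)
               for b, y in product((0, 1), repeat=2))
    return a_ok and b_ok


# Reduced state of system A, ordered (P(0|X=0), P(1|X=0), P(0|X=1), P(1|X=1)).
reduced_a = [marginal_a(pr_box, a, x, 0) for x in (0, 1) for a in (0, 1)]

print(is_no_signalling(pr_box), is_no_signalling(signalling), reduced_a)
```

For the Popescu-Rohrlich table the marginals do not depend on the distant measurement, so the reduced state is well defined; for the counterexample the check fails and no reduced state could be assigned consistently.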

As seen in the last section, a particular theory specifies 
a set S of allowed states for each type of system. This 
applies also for each type of multi-partite system. There 
is, however, a constraint. 

Constraint 2 Suppose that $P^{AB} \in S^{AB}$, where $S^{AB}$ is
the set of allowed states for the joint system. Suppose
that $P^A$ is the reduced state for system A corresponding
to $P^{AB}$. Then $P^A \in S^A$, where $S^A$ is the set of allowed
states for system A.

That systems combine according to a tensor product
rule is asserted by the following three theorems. Proofs
are in Appendix B.

Theorem 1 Denote the vector spaces containing the vectors
$P^{AB}$, $P^A$, and $P^B$ by $V^{AB}$, $V^A$, and $V^B$ respectively.
Then one can identify

$$V^{AB} = V^A \otimes V^B.$$

Theorem 2 Any $P^{AB} \in S^{AB}$ can be written

$$P^{AB} = \sum_i r_i \, P^A_i \otimes P^B_i, \qquad (19)$$

with the $r_i$ real, $P^A_i \in S^A$ and $P^B_i \in S^B$. Both $P^A_i$ and
$P^B_i$ can be taken to be normalized and pure.

Theorem 3 Consider a transformation on system A
alone defined by

$$P^A \to P'^A = M^A.P^A.$$

The transformation of the joint system is given by

$$P^{AB} \to P'^{AB} = (M^A \otimes I).P^{AB},$$

where $I$ is the identity matrix on $V^B$.

Recall that transformations include probabilistic transformations
that decrease the normalization of the state.
Thus an immediate corollary of Theorem 3 is

Corollary 2 If a measurement is performed on system
A alone, with state $P^A$, the probability of a particular
outcome is given by

$$R.P^A = (R \otimes I).P^{AB}. \qquad (20)$$

Here, $I$ is a vector representing the identity measurement,
that is $I.P^B = |P^B|\ \forall P^B \in S^B$. The way things
are set up, $I$ is not unique but can always be taken to be
$(1, \dots, 1 \mid 0, \dots, 0 \mid 0, \dots, 0 \mid \cdots)$.

Much follows from these theorems and corollary. 
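
The content of Corollary 2 can be checked numerically on a small example. The Python sketch below is an illustration, not a proof: it assumes two systems with two binary fiducial measurements each, uses the standard Popescu-Rohrlich correlations as the joint state, and orders the joint vector so that it matches the Kronecker product of the single-system orderings. The helper names `kron` and `dot` and the index convention are choices made for the example.

```python
from fractions import Fraction


def kron(u, v):
    """Kronecker (tensor) product of two flat vectors."""
    return [u_m * v_n for u_m in u for v_n in v]


def dot(r, p):
    """Inner product R.P."""
    return sum(r_n * p_n for r_n, p_n in zip(r, p))


def pr_box(a, b, x, y):
    """P(a, b | X = x, Y = y) for the standard Popescu-Rohrlich correlations."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)


# Single-system ordering: index n = 2 * measurement + outcome.
single = [(x, a) for x in (0, 1) for a in (0, 1)]

# Joint vector with index 4 * n_A + n_B, matching the Kronecker ordering.
P_AB = [pr_box(a, b, x, y) for (x, a) in single for (y, b) in single]

# Reduced state of A, summing over the outcome of measurement Y = 0 on B.
P_A = [sum(pr_box(a, b, x, 0) for b in (0, 1)) for (x, a) in single]

I_B = [1, 1, 0, 0]  # identity measurement vector on B: I_B.P = |P|
R = [0, 0, 0, 1]    # outcome a = 1 of fiducial measurement X = 1 on A

lhs = dot(R, P_A)              # probability computed from the reduced state
rhs = dot(kron(R, I_B), P_AB)  # probability computed from the joint state

print(lhs, rhs)
```

The two numbers agree, as the corollary asserts, even though this joint state is not a product state.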

Collapsed states. Suppose that an operation is performed
on a system A in a state $P^A$. Suppose that
the operation has outcomes $i$ such that the final normalized
state conditioned on outcome $i$ is given by
$P'^A_i = M_i.P^A / |M_i.P^A|$. The change in the state of system A
is analogous to the quantum mechanical collapse of the
state vector. If systems A and B begin in some joint state
$P^{AB}$, and a measurement is performed on system A, then
the final state of system B, conditioned on a particular
outcome for the measurement, is also unambiguously de- 
termined. Thus this "collapse" is also well-defined "at 
a distance". Typically, similar questions of interpreta- 
tion arise in theories in this framework as do in quantum 
theory. Is this collapse a real process? A change in an 
agent's degrees of belief following her measurement? And 
so on. 

Entanglement and nonlocality. In Theorem 2, a joint
state of a system AB is written as a linear sum of direct 
product states. Note that the theorem does not assert 
that a joint state of AB can be written as a convex com- 
bination of direct product states. In general, there will 
be joint states that cannot be written in this form. These 
are the entangled states of the theory. Entanglement is 
distinct from nonlocality, where the latter means viola- 
tion of a Bell inequality. Thus i) there are theories such 
as classical theories that have no entanglement or nonlo- 
cality, ii) there may be theories that have entanglement 
but no nonlocality, and iii) there are theories, such as 
quantum theory and GNST developed below, that have 
both entanglement and nonlocality, although these may 
not coincide.^ 



^ It is clear that entanglement is necessary for nonlocality. But in 
quantum theory there are entangled mixed states that are local 
[33, 34], hence entanglement is not sufficient for nonlocality. In
GNST, on the other hand, entanglement and nonlocality do co- 
incide. This is because if one can write down a local model for a 
particular state in GNST, then the model will itself define a con- 
vex decomposition of that state into product states allowed by 
the theory. This is not true in quantum theory because arbitrary 
local models can employ probability assignments not correspond- 
ing to any quantum state. 



Multi-partite systems. The state of a multi-partite system
can be written as a vector
$P^{AB\cdots Z} \in V_A \otimes V_B \otimes \cdots \otimes V_Z$. This vector can be written as a linear sum
of direct product states $\sum_i r_i\, P_i^{A} \otimes P_i^{B} \otimes \cdots \otimes P_i^{Z}$, with
$r_i \in \mathbb{R}$, $P_i^{A} \in S^{A}$, and so on. A transformation on system
A alone takes the form $M \otimes I \otimes \cdots \otimes I$, and similarly
for transformations on B, . . . , Z alone. These extensions 
of the above theorems follow, since those theorems were 
stated for arbitrary bipartite systems AB and included 
the fact that A and B may themselves be composite. 

Finally, recall that a theory, in addition to specifying 
the set S of allowed states for each type of system, must 
also specify the set T of allowed transformations. 

Definition 1 A transformation $M^{A}$ on system A is well-defined
if $(M^{A} \otimes I) \cdot P^{AB} \in S^{AB}$ whenever $P^{AB} \in S^{AB}$,
for all types of system B.

This definition corresponds to the fact that in quantum 
theory, allowed transformations must be completely pos- 
itive maps (and not, e.g., merely positive maps). An 
obvious constraint is 

Constraint 3 For each type of system, all transformations
in T must be well-defined.

A natural assumption is 

Assumption 6 If $P^{A} \in S^{A}$ and $P^{B} \in S^{B}$, then
$P^{A} \otimes P^{B} \in S^{AB}$.

A final assumption that is convenient is 

Assumption 7 A theory first specifies a set S of allowed 
states for each type of system. All transformations that 
are well-defined are then allowed transformations. 

This assumption is indeed satisfied by all the theories 
considered below, including classical theories, quantum 
theory, GNST, and GLT. It is nice because it means that 
a theory is completely specified once the allowed types 
of system are specified, along with the set S of allowed 
states for each type. In this case, Assumption 7 defines
the set T. The way things are set up, each of the sets
$\mathcal{O}$, $\mathcal{M}$ and $\mathcal{R}$ is in turn defined by T. Assumption 7 also
ensures that certain other obvious constraints hold that
do not then need to be stated separately. For example,
it implies that if $M \in T$ and $N \in T$, then $M \cdot N \in T$.
Along with Constraint 3, it implies that if $M^{A} \in T^{A}$,
then $M^{A} \otimes I^{B} \in T^{AB}$. Finally, Assumption 7, along
with Assumption 5 and Constraint 2, implies that if a
procedure consists in introducing an ancilla to system A,
performing some joint transformation on A and ancilla,
and then throwing away the ancilla, then the corresponding
transformation on A alone is in $T^{A}$.

The fact that transformations have to be well-defined 
yields one of the main insights of this work. There is 
a rich interplay between the set of allowed states, the 
allowed dynamics, and the information processing possi- 
bilities that a theory offers. For example, if a theory is
modified by enlarging the set of allowed states (adding
super-correlated states to quantum theory, perhaps), one 
might naively think that this must increase the informa- 
tion processing possibilities. However, enlarging the set 
of allowed states may well have the effect of decreasing 
the set of allowed transformations, in which case the ef- 
fect may well be the opposite. 



III. A BRIEF NOTE ON AMBIGUITIES

There are a couple of points that deserve a mention
here in case it be thought that they cause problems (this
section may perhaps be omitted on a first reading). First,
two theories may be identical in their structure, that is,
the sets $S$, $T$, $\mathcal{O}$, $\mathcal{R}$ and $\mathcal{M}$ of allowed states,
transformations, operations, outcomes and measurements could be
mathematically identical in each theory, yet the theories
could differ physically because the mathematical objects
are assigned to different physical objects. For example,
a particular preparation device could be associated with
one state in one theory and another state in the other
theory.

Second, one theory could be made to look different,
that is, have different sets $S$, $T$, $\mathcal{O}$, $\mathcal{R}$ and $\mathcal{M}$, simply
because different measurement devices are chosen to
correspond to fiducial measurements. Thus in quantum theory
the state of a qubit could be specified by the probabilities
for the outcomes of spin measurements in the x, y
and z directions. The set S is then a sphere. Equally,
the quantum state could be specified by the probabilities
for measurements in the x, y and n directions, where
$n = (x + z)/\sqrt{2}$. In this case, the set S is a non-spherical
ellipsoid. A fiducial set may even have different
numbers of measurements and outcomes. For example,
any quantum state can be expressed by giving the probabilities
of the outcomes for a single, informationally complete
POV measurement [15]. The important thing is
that the outcomes of the fiducial measurements in the
new formulation are represented by linearly independent
vectors in the old formulation. Thus there is an invertible
matrix N such that the two formulations are related by
$P' = N \cdot P$, $R'^{T} = R^{T} \cdot N^{-1}$, and $M' = N \cdot M \cdot N^{-1}$. The
theory makes the same predictions since $R' \cdot P' = R \cdot P$,
and so on.

The first of these points means that in order to compare
the predictions of two theories, one has to know
which physical devices different preparations and operations
correspond to. But being primarily interested in
the information processing properties of theories, we can
ignore this issue and concentrate on the structure of the
theories. The second point ensures that we can do this
unambiguously. The structure of a theory and the conclusions
drawn for information processing do not depend
on which measurements are chosen for the fiducial set.



IV. SOME DIFFERENT THEORIES

It is useful to see examples of theories that can be described
in this framework. The most important are classical
theories and quantum theory. Two others are GLT
and GNST. All of these theories satisfy Assumption 7,
which means that each is completely determined by the
set S of allowed states for each type of system.



A. Classical theories 

Suppose that for some particular type of system, the 
fiducial set can be chosen as a single measurement with d 
outcomes, and that any (possibly sub-normalized) prob- 
ability distribution over these outcomes corresponds to a 
(possibly sub-normalized) allowed state. In this case, the 
system is classical. A classical theory is one for which all 
systems are classical. The most comprehensive classical 
theory is the one for which there is a type of system for 
every d > 1. For a classical system, S is a simplex. Pure
states are represented by vectors $e_i$, with a 1 for the ith
component and 0s elsewhere. The state of a bipartite
system of two classical systems is also represented by a 
vector from a probability simplex, the entries being the 
joint probabilities for outcomes i and j when the fiducial 
measurement is performed on each system. An allowed 
transformation M must map a pure state $e_i$ to another
allowed state. It is easy to show that each entry of M
must be non-negative, and the sum of each column must be
$\ge 0$ and $\le 1$. In the case that M preserves normalization,
it is a stochastic matrix.^ The set $\mathcal{R}$ is a hypercube.

Consider, for example, an ordinary die which can exist 
in six different deterministic states. The P vector is six 
dimensional and gives the probabilities that the die's up- 
permost face is 1, 2, . . . , 6. An example of a measurement 
is one that asks, is the uppermost face 1 or 2? The yes 
outcome corresponds to the vector R = (1,1,0,0,0,0). 
The state of two dice, A and B, can be written as a 36- 
dimensional vector, whose entries are the probabilities 
for the uppermost faces being 11, 12, ... , 66. 
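
The die example can be made concrete in a few lines. This is an
illustrative sketch under stated assumptions: the "flip" transformation
(face i becomes face 7 - i) and the loaded state are hypothetical
examples, not taken from the text, and `is_allowed_classical` simply
encodes the column condition stated above.

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def apply(M, P):
    """Apply a matrix (given as a list of rows) to a state vector."""
    return [dot(row, P) for row in M]

def is_allowed_classical(M):
    """Entries non-negative and every column sum between 0 and 1."""
    cols = list(zip(*M))
    return (all(x >= 0 for row in M for x in row)
            and all(0 <= sum(c) <= 1 for c in cols))

fair = [F(1, 6)] * 6                 # fair die
R = [1, 1, 0, 0, 0, 0]               # "is the uppermost face 1 or 2?" -> yes
p_yes = dot(R, fair)
print(p_yes)

# Hypothetical transformation: turn the die so that face i becomes 7 - i.
flip = [[1 if r == 5 - c else 0 for c in range(6)] for r in range(6)]
loaded = [F(1, 2)] + [F(1, 10)] * 5
flipped = apply(flip, loaded)
print(is_allowed_classical(flip), flipped)
```

Since `flip` is a permutation matrix, it is stochastic and maps every
pure state to a pure state.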

Suppose that the reduced states of the two dice are
given by $P^{A} = P^{B} = \frac{1}{6}(1,1,1,1,1,1)$. One possible
joint state compatible with $P^{A}$ and $P^{B}$ is a direct product

$$P^{AB} = P^{A} \otimes P^{B} = \frac{1}{36}\,(1, \ldots, 1)^{T}.$$

^ In this work, a stochastic matrix is a not necessarily square
matrix, with non-negative entries, whose columns each sum to 1.

This corresponds to the two dice being uncorrelated. But
another possible joint state with the same reduced states
is

$$P^{AB}_{ij} = \begin{cases} 1/6 & i = j, \\ 0 & \text{otherwise.} \end{cases}$$

This corresponds to perfect correlation and obviously 
cannot be written as a direct product. Of course there is 
no entanglement or nonlocality in this theory.^ 
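
The two joint states of the dice can be compared directly. The sketch
below is illustrative: joint states are stored as 6 x 6 tables rather
than 36-dimensional vectors, and the helper names are assumptions made
for this example.

```python
from fractions import Fraction as F

d = 6
uniform = [F(1, d)] * d

# Joint states as d x d tables: P[i][j] is the probability of faces (i+1, j+1).
product_state = [[uniform[i] * uniform[j] for j in range(d)] for i in range(d)]
correlated = [[F(1, d) if i == j else F(0) for j in range(d)] for i in range(d)]

def reduced_A(P):
    return [sum(row) for row in P]

def reduced_B(P):
    return [sum(col) for col in zip(*P)]

def is_product(P):
    """True if P equals the direct product of its own reduced states."""
    a, b = reduced_A(P), reduced_B(P)
    return all(P[i][j] == a[i] * b[j] for i in range(d) for j in range(d))

for name, P in [("product", product_state), ("correlated", correlated)]:
    print(name, reduced_A(P) == uniform, reduced_B(P) == uniform, is_product(P))
```

Both states have uniform reduced states, but only the first is a
direct product. The correlated state is nonetheless a convex mixture
of the six product states with both dice showing the same face, so it
is not entangled.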



B. Quantum Theory (in finite dimensions) 

Quantum theory only allows certain types of system. 
For example, there are no systems that can be described 
with two fiducial measurements each with two outcomes. 
A qubit can be described by three fiducial measurements 
with two outcomes, e.g., spin measurements in the x, y
and z directions. Once a set of fiducial measurements is 
chosen quantum theory tells us what the allowed states 
P are. In the simple case of a qubit, the set of normalized 
states is the Bloch sphere. In the case of higher dimen- 
sional quantum systems it does not appear to be so easily 
characterized (except via the usual quantum formalism 
of course). The transformations that are well-defined,
in the sense of Definition 1, correspond precisely to the
linear completely positive maps. It is usually assumed
that any such map corresponds to a physically possible
operation, thus Assumption 7 is satisfied. Any set of $R_i$
with $0 \le R_i \cdot P \le 1$ for all i and all $P \in S$, and
$\sum_i R_i \cdot P = 1$ for all $P \in S$,
is a positive operator-valued measurement in the usual
formalism.

^ There is nothing difficult in the preceding remarks. But part
of the aim of Section II B is to deflate the significance of the
tensor product rule for combining systems in quantum theory.
Thus it is useful to note that a similar rule arises quite naturally
in what is essentially classical probability theory. The quantum
tensor product rule does not have to be regarded, as it frequently
is, as a mysterious replacement for the Cartesian product used
in combining deterministic classical states. If quantum states
(even pure ones) are more analogous to probabilistic classical
states than anything else - in other words if some version of the
epistemic interpretation of the quantum state is correct - then a
tensor product rule is exactly what one would expect. Thus one
way of viewing the tensor product is as evidence for the epistemic
interpretation.

There is nothing new in the fact that quantum states
can be represented as real vectors and transformations
as matrices acting on these vectors. It is well known
that Hermitian operators in d dimensions form a
$d^2$-dimensional real vector space, with an inner product
given by $\mathrm{Tr}(AB)$. Linear completely positive maps
correspond to $d^2 \times d^2$ matrices acting on this space. But the
present framework does not correspond exactly to this
representation (e.g., it is possible that $P \cdot P > 1$), so it is
useful to see an example. A qubit whose state is spin up
in the z-direction can be written



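$$P = \bigl(P(\uparrow_x),\, P(\downarrow_x) \,\big|\, P(\uparrow_y),\, P(\downarrow_y) \,\big|\, P(\uparrow_z),\, P(\downarrow_z)\bigr)^{T} = \bigl(\tfrac{1}{2}, \tfrac{1}{2} \,\big|\, \tfrac{1}{2}, \tfrac{1}{2} \,\big|\, 1, 0\bigr)^{T},$$

where $P(\uparrow_x)$ is the probability of obtaining spin up
when measuring in the x direction, and so on. It can
now be verified that if, for example, spin is measured in
the n-direction, where $n = (x + z)/\sqrt{2}$, then the up
outcome corresponds to the vector

$$R = \Bigl(\tfrac{1}{2\sqrt{2}},\, -\tfrac{1}{2\sqrt{2}} \,\Big|\, \tfrac{1}{2},\, \tfrac{1}{2} \,\Big|\, \tfrac{1}{2\sqrt{2}},\, -\tfrac{1}{2\sqrt{2}}\Bigr)^{T}.$$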
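
The claim that R reproduces the quantum probabilities can be checked
numerically. The sketch below is illustrative and rests on stated
assumptions: states are parametrized by their Bloch vector, the
ordering of entries is (up_x, down_x | up_y, down_y | up_z, down_z),
and R is the representative vector with entries
(1/(2 sqrt 2), -1/(2 sqrt 2) | 1/2, 1/2 | 1/(2 sqrt 2), -1/(2 sqrt 2)).
The comparison value is the standard quantum probability
(1 + r.n)/2 for a spin measurement along n.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def state_from_bloch(rx, ry, rz):
    """Fiducial vector (up_x, down_x | up_y, down_y | up_z, down_z)."""
    return [(1 + rx) / 2, (1 - rx) / 2,
            (1 + ry) / 2, (1 - ry) / 2,
            (1 + rz) / 2, (1 - rz) / 2]

def born_up_along_n(rx, ry, rz):
    """Standard quantum probability of spin up along n = (x + z)/sqrt(2)."""
    return (1 + (rx + rz) / math.sqrt(2)) / 2

s = 1 / (2 * math.sqrt(2))
R = [s, -s, 0.5, 0.5, s, -s]

up_z = state_from_bloch(0, 0, 1)
print(up_z)
print(dot(R, up_z), born_up_along_n(0, 0, 1))

# The same agreement holds for other states, pure or mixed.
samples = [(1, 0, 0), (0, 1, 0), (0.6, 0.0, -0.8), (0.0, 0.0, 0.0)]
agree = all(math.isclose(dot(R, state_from_bloch(*r)), born_up_along_n(*r))
            for r in samples)
print(agree)
```

For the spin-up state along z both numbers equal cos^2(pi/8), roughly
0.854.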



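This vector is not unique. Any vector $R' = R + C$, where
$C \cdot P = 0$ for all $P \in S$, represents the same measurement
outcome. The unitary transformation usually written as the
Pauli matrix $\sigma_z$ reverses the x and y components of the spin
and leaves the z component unchanged, so it would correspond to
a matrix that exchanges the up and down entries for the x and y
measurements, for example

$$M = \begin{pmatrix} 0&1&0&0&0&0 \\ 1&0&0&0&0&0 \\ 0&0&0&1&0&0 \\ 0&0&1&0&0&0 \\ 0&0&0&0&1&0 \\ 0&0&0&0&0&1 \end{pmatrix}.$$



C. Generalized Non-Signalling Theory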

Suppose that for any pair n, k > 1, there is a corresponding
type of single system, whose state can be described
by a set of n fiducial measurements, each with
k outcomes. Call this an (n, k) system.^ For a single
system, allow any state P, provided the entries of P are
between 0 and 1 and Eq. (3) is satisfied. For multi-partite
systems, allow any state P, provided entries are between
0 and 1, Eq. (3) is satisfied, and the no-signalling conditions
are satisfied for all bipartite splittings.
The resulting theory is Generalized Non-Signalling
Theory.

^ A more general theory would include further types of system
with different numbers of outcomes for different fiducial
measurements. I ignore this possibility. I do not believe that it
would change much beyond introducing uninteresting complications
into some of the proofs.

It is useful to see some examples of systems in this
theory. The simplest kind of single system has two binary
fiducial measurements. This type of system plays a role
somewhat analogous to that of a classical bit or a qubit,
so from here on it is called a gbit (for generalized bit). The
space of possible normalized states is shown in Fig. 1.
There are four pure states, which correspond to the four
ways of assigning definite outcomes to the X = 1 and
X = 2 fiducial measurements. In the figure, these are
represented by (1, 1), (1, 2), (2, 1), and (2, 2), where (1, 2),
for example, is the state which returns a = 1 for the X =
1 measurement and a = 2 for the X = 2 measurement,
and is also represented by P = (1,0|0,1). Thus pure
states of single systems have a definite outcome for each
fiducial measurement - there is no uncertainty principle.
As noted in the figure, if the measurements X = 1 and
X = 2 are associated with spin measurements in the z
and x directions, then we can include possible states of
a qubit in the diagram, and these form a circle inscribed
in the square. Qubits of course have an extra degree
of freedom, namely spin in the y direction. For (3, 2)
systems the space of states is a cube, with an inscribed
sphere (the Bloch sphere) representing quantum states.

FIG. 1: The space of normalized states for a gbit in GNST
corresponds to the square. If the measurements X = 1 and
X = 2 are associated with spin measurements in the z and x
directions, then the space of states for a quantum mechanical
qubit corresponds to the circle.

FIG. 2: An allowed transformation.

Consider the possible transformations of a gbit (for
simplicity, restrict attention to those that preserve
normalization). An allowed transformation will transform
the square in such a manner that all points remain in the
square, otherwise the transformation is not well-defined
in the sense of Definition 1. The transformations of
Figs. 2 and 3 are allowed. But the transformation of
Fig. 4 is not allowed. Transformations in quantum theory
are less restricted because the requirement is only
that points in the circle are transformed into points in
the circle. So a rotation of π/4, as in Fig. 4, is fine, and
indeed corresponds to the well known π/8 gate.

FIG. 3: Another allowed transformation.

FIG. 4: A forbidden transformation.
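
The contrast between the square and the circle can be checked with a
short calculation. This sketch is illustrative and makes its own
assumptions: a normalized gbit state is represented by the point
(P(a=1|X=1), P(a=1|X=2)) in the unit square, the qubit states by the
inscribed circle of radius 1/2, and the quarter-turn is a hypothetical
example of a transformation that merely relabels pure states.

```python
import math

def rotate_about_centre(p, angle):
    """Rotate the point p about the centre (1/2, 1/2) of the square."""
    x, y = p[0] - 0.5, p[1] - 0.5
    c, s = math.cos(angle), math.sin(angle)
    return (0.5 + c * x - s * y, 0.5 + s * x + c * y)

def in_square(p, tol=1e-12):
    return all(-tol <= v <= 1 + tol for v in p)

def in_circle(p, tol=1e-12):
    return (p[0] - 0.5) ** 2 + (p[1] - 0.5) ** 2 <= 0.25 + tol

pure_gbit_states = [(1, 1), (1, 0), (0, 1), (0, 0)]   # vertices of the square
circle_points = [(0.5 + 0.5 * math.cos(k * math.pi / 6),
                  0.5 + 0.5 * math.sin(k * math.pi / 6)) for k in range(12)]

quarter_turn_ok = all(in_square(rotate_about_centre(p, math.pi / 2))
                      for p in pure_gbit_states)
eighth_turn_ok = all(in_square(rotate_about_centre(p, math.pi / 4))
                     for p in pure_gbit_states)
eighth_turn_ok_qubit = all(in_circle(rotate_about_centre(p, math.pi / 4))
                           for p in circle_points)
print(quarter_turn_ok, eighth_turn_ok, eighth_turn_ok_qubit)
```

A rotation by π/4 pushes the vertices of the square outside it, so it
is forbidden for a gbit, while the circle of qubit states is mapped to
itself.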

It begins to look as if the dynamics of single systems 
in GNST is rather simple. Indeed, this is the case.
Section VI contains a theorem that states that for single
systems in GNST, allowed transformations correspond 
essentially to relabellings of measurements and outcomes, 
and probabilistic combinations thereof. Thus in a sense, 
the dynamics is classical. Despite this, the dynamics 
does contain possibilities that quantum dynamics does 
not. Consider a (3, 2) system, whose space of normal- 
ized states is a cube, with the quantum Bloch sphere in- 
scribed. A possible transformation is a reflection in the 
center of the sphere. This corresponds to the so called 
Universal NOT gate of quantum theory, which is not an 
allowed transformation since it is not completely positive. 

The multi-partite states of GNST are noteworthy in 
that they include states that are more nonlocal than 
quantum theory allows. For example, given a bipartite 
system of two gbits, the following is a possible state. 

$$XY \in \{11, 12, 21\}: \quad P(a=1, b=1|XY) = P(a=2, b=2|XY) = 1/2, \qquad (21)$$

$$XY = 22: \quad P(a=1, b=2|XY) = P(a=2, b=1|XY) = 1/2. \qquad (22)$$

The correlations obtained from fiducial measurements on
this state return a value of 4 for the left hand side of the
following inequality

$$P(a=b|11) + P(a=b|12) + P(a=b|21) + P(a \neq b|22) \le 3$$

(this is the CHSH inequality [s^ written in a slightly
different form than usual). These correlations cannot
be obtained from measurements on any quantum state
since, by Tsirelson's theorem [36], quantum states can
only reach a maximum of $2 + \sqrt{2}$.
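
The properties claimed for this state are easy to verify by brute
force. The following is a minimal sketch with illustrative function
names: it encodes the state as a function P(a, b | X, Y), evaluates
the left hand side of the inequality, and checks the no-signalling
conditions for the two parties.

```python
import math
from fractions import Fraction as F
from itertools import product

def pr_box(a, b, X, Y):
    """P(a, b | X, Y) for the state above; a, b, X, Y take values 1 or 2."""
    if (X, Y) == (2, 2):
        return F(1, 2) if a != b else F(0)
    return F(1, 2) if a == b else F(0)

def chsh_lhs(P):
    """P(a=b|11) + P(a=b|12) + P(a=b|21) + P(a!=b|22)."""
    total = F(0)
    for X, Y in product((1, 2), repeat=2):
        for a, b in product((1, 2), repeat=2):
            wanted = (a != b) if (X, Y) == (2, 2) else (a == b)
            if wanted:
                total += P(a, b, X, Y)
    return total

def is_non_signalling(P):
    """Each party's marginal must not depend on the other party's choice."""
    for X, a in product((1, 2), repeat=2):
        if len({sum(P(a, b, X, Y) for b in (1, 2)) for Y in (1, 2)}) != 1:
            return False
    for Y, b in product((1, 2), repeat=2):
        if len({sum(P(a, b, X, Y) for a in (1, 2)) for X in (1, 2)}) != 1:
            return False
    return True

print(chsh_lhs(pr_box), is_non_signalling(pr_box))
print(2 + math.sqrt(2))   # the quantum maximum, for comparison
```

The state reaches the algebraic maximum of 4 while every marginal
remains uniform, so no information is transmitted between the parties.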

Information processing in GNST is discussed in
Section VII. The theory's permissiveness with respect to
states implies that some things can be achieved that are
impossible in quantum theory. These include 1-2 oblivious
transfer, van Dam's scheme for the easy solution
of communication complexity problems [25], and a kind
of super-quantum memory. The restricted nature of the
dynamics, however, implies that there is no teleportation
or super-dense coding. The theorems of Section VI give
evidence that computation is no better than classical.



D. Generalized Local Theory 

Suppose that, as in GNST, for any pair n, k > 1, there
is a corresponding type of system, whose state can be
defined with n fiducial measurements with k outcomes. As
in GNST, all P with entries between 0 and 1 satisfying
Eq. (3) are allowed states. The only multi-partite states
allowed, however, are those for which the fiducial mea- 
surements return local (non-Bell-violating) correlations. 
This defines Generalized Local Theory. 

As in GNST, the pure states of single systems in this 
theory are those that have a deterministic outcome for 
each fiducial measurement. Since multi-partite states are 
local with respect to fiducial measurements, the pure 
states of a multi-partite system are precisely those in 
which each subsystem is in a deterministic pure state. 
An arbitrary state of a multi-partite system is a convex 
mixture of these. It follows that no state in this theory 
can violate a Bell inequality, even if non-fiducial mea- 
surements are performed. Hence the name. 
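
For two gbits and fiducial measurements, the locality of GLT states
can be illustrated by enumeration. The sketch below is illustrative
only: it builds the 16 pure product states (each gbit has a
deterministic answer to each fiducial measurement), evaluates the
left hand side of the CHSH-type inequality used above for GNST, and
does the same for one convex mixture. It does not address non-fiducial
measurements.

```python
from fractions import Fraction as F
from itertools import product

def chsh_lhs(P):
    """P(a=b|11) + P(a=b|12) + P(a=b|21) + P(a!=b|22) for a table P."""
    total = 0
    for X, Y in product((1, 2), repeat=2):
        for a, b in product((1, 2), repeat=2):
            wanted = (a != b) if (X, Y) == (2, 2) else (a == b)
            if wanted:
                total += P[(a, b, X, Y)]
    return total

def deterministic_product(a1, a2, b1, b2):
    """Pure product state: gbit A answers (a1, a2), gbit B answers (b1, b2)."""
    fa, fb = {1: a1, 2: a2}, {1: b1, 2: b2}
    return {(a, b, X, Y): int(a == fa[X] and b == fb[Y])
            for a, b, X, Y in product((1, 2), repeat=4)}

pure_states = [deterministic_product(*o) for o in product((1, 2), repeat=4)]
values = [chsh_lhs(P) for P in pure_states]
print(max(values))

# A convex mixture of pure product states cannot exceed the pure-state maximum.
weights = [F(1, 16)] * 16
mixture = {key: sum(w * P[key] for w, P in zip(weights, pure_states))
           for key in pure_states[0]}
print(chsh_lhs(mixture))
```

Because the left hand side is linear in the state, the maximum over
all convex mixtures is attained at a pure product state, which is why
the bound of 3 holds for every GLT state of two gbits.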

GLT is more general than quantum theory in allowing 
arbitrary single system states, but more restricted in not 
allowing nonlocal states. As described in Section VII,
GLT allows 1-2 oblivious transfer. Computation in GLT, 
however, is efficiently simulable by a classical computer. 



^ These non-signalling super-quantum correlations were written
down by Khalfin and Tsirelson [16], and were independently
introduced by Popescu and Rohrlich [17]. Other examples of
super-quantum correlations, involving more measurements or parties,
are given in Ref. [18]. The latter are also allowed in GNST.



E. Other possibilities 

There are other possibilities that would be interesting 
to investigate. For example, 

1. A theory that is essentially quantum theory but 
with only separable states allowed. 

2. A theory in which the state of a single system must 
be a quantum state, but in which the state of a 
multi-partite system can be anything, as long as the 
no-signalling principle and the restriction that the 
reduced states for the individual subsystems must 
be quantum are satisfied. The latter idea has been 
investigated in Ref. [37], where it is shown, amongst 
other things, that Tsirelson's theorem still holds. 

V. GENERIC PROPERTIES OF THEORIES 

One of the reasons for introducing a framework en- 
compassing many different theories is that it is interest- 
ing to identify properties of theories that are generic, in 
the sense that they are shared by all or most theories 
in the framework. Some features, usually thought of as 
specifically quantum, are present in all theories in our 
framework except theories that are classical (in the sense 
of Section IV A). Thus classical theories are very special!
These features include the fact that mixed states do not
always have a unique decomposition into pure states, and
a no-go theorem for universal cloning. More exact statements
of these claims are given in this section. Proofs are
in Appendix C. It is tedious to write always "all theories
in the framework", so from here on this is shortened to "all
theories", taking the assumptions of Section II as read.

First, 

Theorem 4 Suppose that for a particular type of system, 
every mixed state has a unique decomposition into pure 
states and 0. Then the system is classical. 

The next theorem concerns the disturbance of systems 
on measurement and is due in part to Howard Barnum 
and Alex Wilce [s^. Say that a transformation disturbs 
a state P if there is no constant c such that M.P = cP. 
This means that, conditioning on the outcome corre- 
sponding to this transformation, the state is no longer 
P. A transformation is non-disturbing if no pure state is 
disturbed and an operation {Mi} is non-disturbing if all 
Mi are non-disturbing. 

Theorem 5 For any system, let V be the vector space
in which states are defined, and let $V_S$ be the subspace
spanned by S. Then $V_S$ can be written as a direct sum,
$V_S = \bigoplus_i V_i$, where the $V_i$ are subspaces of $V_S$, such that

1. Every pure state P is contained in some $V_i$.

2. A non-disturbing transformation is of the form
$M = \sum_i c_i I_i$, where $0 \le c_i \le 1$, and $I_i$ is the
identity on $V_i$.

It follows that non-disturbing operations have the same
outcome probabilities for pure states in the same $V_i$, and
thus cannot distinguish them. It is easy to show that a
system is classical if and only if each $V_i$ contains exactly
one pure state. For a quantum system without superselection
rules, $V_S$ cannot be further decomposed into
a direct sum. Non-disturbing operations have the same
outcome probabilities for all pure states, each transformation
being proportional to the identity on $V_S$. An
example of such an operation would be to toss a coin,
without interacting with the system at all, and to output
the result. For a quantum system with superselection
rules, pure states from the same sector are elements of
the same $V_i$, and different sectors correspond to different
$V_i$.

Theorem 5 has implications for cloning. Cloning refers
to the following procedure:

1. Begin with a system A in a pure state. Denote its
state P.

2. Introduce a system B of the same type, prepared in
a standard state Q. The state of the joint system
is $P \otimes Q$.

3. A joint transformation M acts on the pair of systems
such that the final state is $M \cdot (P \otimes Q) \propto P \otimes P$.

A deterministic universal cloning procedure always succeeds
and works on all pure states. It implies the existence
of a normalization-preserving M and a state Q such
that $M \cdot (P \otimes Q) = P \otimes P$ for all pure P. A probabilistic
universal cloning procedure is allowed to output a fail
outcome, but conditioned on success, the final state must
be $P \otimes P$. There must be a non-zero probability of success
for all pure states P. This type of cloning implies the
existence of a non-zero M such that $M \cdot (P \otimes Q) = c\, P \otimes P$
for all pure P, where c can vary with P, and $0 < c \le 1$.

Theorem 6 With the exception of classical systems, 
there is no probabilistic universal cloning procedure. 

This of course implies that with the exception of classi- 
cal systems, there is no deterministic universal cloning 
procedure. 
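
The classical exception can be made explicit. The sketch below is
illustrative and assumes a hypothetical three-level classical system:
it builds the copying matrix on a pair of such systems, checks that it
clones every pure state when the second system starts in a standard
state, and shows that the same matrix does not clone a mixed state
(the output is perfectly correlated rather than a direct product).

```python
from fractions import Fraction as F

d = 3  # a hypothetical three-level classical system

def kron(u, v):
    return [a * b for a in u for b in v]

def basis(i):
    return [F(int(j == i)) for j in range(d)]

def apply(M, P):
    return [sum(m * p for m, p in zip(row, P)) for row in M]

# Copying matrix on the pair: (i, j) -> (i, i). Columns are indexed by the
# input pair and rows by the output pair, both flattened as i * d + j.
M = [[F(0)] * (d * d) for _ in range(d * d)]
for i in range(d):
    for j in range(d):
        M[i * d + i][i * d + j] = F(1)

Q = basis(0)  # standard state of the blank system

pure_ok = all(apply(M, kron(basis(i), Q)) == kron(basis(i), basis(i))
              for i in range(d))

mixed = [F(1, 2), F(1, 2), F(0)]
out = apply(M, kron(mixed, Q))
mixed_ok = (out == kron(mixed, mixed))
print(pure_ok, mixed_ok)
```

Every column of M contains a single 1, so M is a stochastic matrix and
hence an allowed classical transformation.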

Theorems 4, 5 and 6 apply even to classical theories if
extended to mixed states. Thus there are mixed states
with a non-unique decomposition into mixed states. All
transformations disturb at least one mixed state unless
they are proportional to the identity.^ And cloning of
classical mixed states is impossible. One possible
interpretation of these remarks is as further evidence that
quantum pure states are more akin to classical mixed
states than classical pure states.

^ This is not at all surprising if put into more prosaic terms.
Consider that a die is in a state such that the probability of each
face being uppermost is 1/6. Suppose that the die is measured,
to find out which face is uppermost, and the value 1 found.
Then, if it is assumed that the measurement operation was done
in the most obvious way, the state after measurement is not
1/6(1,1,1,1,1,1), but (1,0,0,0,0,0). Of course the measurement
operation may be such that the die is recast after the outcome
is obtained, resulting in a final state of 1/6(1,1,1,1,1,1).
But then an initial state of (1,0,0,0,0,0) would be disturbed.

^ Suppose that Alice prepares a die in one of two ways, each
corresponding to a probability distribution over the different faces.
The first prepares, say, the state 1/12(6,2,1,1,1,1) and the second
the state 1/6(1,1,1,1,1,1). The die is given to Bob who is
required to perform a cloning operation. This means that Bob
must prepare another die such that if Alice used the first
preparation, its state is 1/12(6,2,1,1,1,1), and if she used the second,
then 1/6(1,1,1,1,1,1). Furthermore, if the dice are measured
after Bob's operation, the results must not be correlated. This
last clause prevents Bob from using a device that simply reads
the uppermost face of the die and prepares another in the same
state. It is easy to see that even if Bob's cloning procedure is
allowed to be probabilistic, he cannot do it.

There are many other questions concerning properties
that are common to all theories, or all except classical
theories. In Ref. [43], the quantum no-broadcasting
theorem is generalized to arbitrary non-classical theories
within a framework closely related to the present
one. It can also be shown that all theories in the framework
have an infinite de Finetti theorem [41], and that
polynomially-sized computations in these theories can be
simulated classically in polynomial space. Features
such as these can be regarded as arising solely from the
assumptions that were made in setting up the framework.



VI. DYNAMICS IN GNST AND GLT

Part of the motivation of this work is to consider which
features of a theory, in particular those features related
to information processing, arise from which assumptions.
It is particularly interesting if significant features, such
as the no-cloning theorem above, arise from very minimal
assumptions and are thus shared by a broad class of
theories. Another part of the motivation is to investigate
theories that are different from those we already know
about. These theories need not even be empirically adequate;
a compare and contrast exercise will still be useful
to learn more about those theories that are empirically
adequate. Thus the next two sections are devoted to a
detailed investigation of GNST and GLT.

In Section IV C, the dynamics of a gbit was briefly
discussed. There are four pure states of a gbit, corresponding
to the four ways of assigning definite outcomes
to the two measurements. The space of normalized
states is a square, with a normalization-preserving
transformation being a linear transformation of this
square. Let us consider more general types of system
in GNST and GLT, but continue to focus on normalized
systems and normalization-preserving transformations,
i.e., operations corresponding to a single matrix,
{M}. For this section and the next, transformation
means normalization-preserving transformation, with the
investigation of probabilistic transformations left for
future work.

The space of normalized states of an (n, k) system is a 
polytope, the vertices corresponding to pure states. Pure 
states are of the form 

P= (0...1...0|0...1...0|...). 

Allowed transformations must take points in the polytope 
to points in the polytope. This condition is so restrictive 
that the following theorem holds. 

Theorem 7 Normalization-preserving transformations
of single systems in GNST or GLT, thought of as active,
correspond to passive transformations that simply relabel
fiducial measurements and outcomes, or to convex
combinations of such. Equivalently, for a transformation of
an (n, k) system, the matrix M representing the
transformation can be written

$$M = \begin{pmatrix} M_{11} & \cdots & M_{1n} \\ \vdots & \ddots & \vdots \\ M_{n1} & \cdots & M_{nn} \end{pmatrix},$$

where $M_{ij}$ is a k × k matrix, and where $M_{ij} = a_{ij} S_{ij}$, for
$S_{ij}$ a stochastic matrix, $0 \le a_{ij} \le 1$, and $\sum_j a_{ij} = 1$.

A useful pictorial representation of this theorem is given
in Fig. 5. A related result is

Theorem 8 The only measurements on single systems
in GNST or GLT are fiducial measurements, possibly
with outcomes relabelled, or convex combinations
of such.

Theorem 8 is illustrated pictorially in Fig. 6. The proofs
of Theorems 7 and 8 are contained in Appendix D.
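
The relabelling picture behind Theorem 7 can be sketched for a gbit.
The code below is illustrative only: measurements and outcomes are
indexed from 0 rather than 1, the functions F1 and F2 play the role of
the relabellings of measurements and outcomes, and the particular
choice of F1 and F2 is a hypothetical example.

```python
from itertools import product

n, k = 2, 2  # a gbit: two fiducial measurements with two outcomes each

def relabelling_matrix(F1, F2):
    """Matrix for: measure X' = F1(X) on the original, output a = F2(X, a')."""
    M = [[0] * (n * k) for _ in range(n * k)]
    for X in range(n):
        for a_in in range(k):
            M[X * k + F2(X, a_in)][F1(X) * k + a_in] += 1
    return M

def apply(M, P):
    return [sum(m * p for m, p in zip(row, P)) for row in M]

def pure_states():
    """All vertices (0..1..0|0..1..0) of the (n, k) state polytope."""
    return [[int(a == outcomes[X]) for X in range(n) for a in range(k)]
            for outcomes in product(range(k), repeat=n)]

# Hypothetical example: swap the two measurements, and flip the outcome
# reported for measurement X = 0 on the transformed system.
M = relabelling_matrix(F1=lambda X: 1 - X,
                       F2=lambda X, a: 1 - a if X == 0 else a)

vertices = pure_states()
maps_vertices_to_vertices = all(apply(M, P) in vertices for P in vertices)
print(M)
print(maps_vertices_to_vertices)
```

In block form, this M has exactly one non-zero k × k block in each
block row, and that block is a permutation matrix, which is the
special case of the theorem with every $a_{ij}$ equal to 0 or 1.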

In the case of GNST, the following theorem holds for a 
bipartite system of two gbits, and suffices to characterize 
the normalization-preserving transformations of such a 
system. 

Theorem 9 Consider a system of two gbits in GNST, 
and suppose that a normalization-preserving transforma- 
tion is performed. Suppose that this transformation is 
followed by the fiducial measurements X, Y on the two 
subsystems, with outcomes a, b. The joint probability of 
obtaining outcomes a,b is equal to that obtained from a 
convex combination of procedures of the following kind. 
First, perform a fiducial measurement X' on one of the
gbits, where X' may depend on X and Y. Denote the
outcome a'. Then perform a fiducial measurement Y' on
the other gbit, where Y' may depend on X, Y and on a'.
Denote the outcome b'. The final outcome pair (a, b) is
a function of X, Y, a' and b'.

FIG. 5: Transformations of single systems in GNST and GLT
can always be represented as the appending of classical
circuits as shown here, or as convex combinations of
transformations of this type. If a fiducial measurement X is
performed on the transformed system, this can be thought of as
performing fiducial measurement X' on the original system,
where X' = F1(X) for some function F1. When measurement
X' is performed on the original system, outcome a' is
obtained with some probability. The probability of obtaining
outcome a for the measurement X on the transformed system
is equal to the probability of obtaining an outcome a'
such that a = F2(X, a'), for some function F2.



Of course this theorem can also be expressed in terms
of a formal constraint on the transformation matrix M,
but in this case it is more complicated and less enlightening.
Theorem 9 may also be understood pictorially, as
in Fig. 7.

Theorem 10 In GNST, the only measurements on bi- 
partite systems comprised of two gbits correspond to con- 
vex combinations of procedures of the following kind. 
First, perform a fiducial measurement X on one of the 
gbits, obtaining an outcome a' . Then perform a fiducial 
measurement Y on the other gbit, where Y may be a func- 
tion of a' , obtaining an outcome b' . The final outcome is 
a function of a' and b' . 

Theorem 10 is illustrated in Fig. 8. The proof of Theorem 9 
is given in Appendix D. It is an open question 
whether similar theorems hold for transformations 
and measurements on arbitrary multi-partite systems in 
GNST. It can be shown that in GLT, there definitely do 
exist possibilities for measurements and transformations 
on multi-partite systems that do not reduce to one of the 
forms presented in this section. 

The proofs in Appendix D also make clear the following. 
The most fine-grained measurements on single systems 
in GNST or GLT can be represented by a set of vectors 
$R_i$, such that each $R_i$ has one element between 0 and 

FIG. 6: Measurements on single systems in GNST or GLT 
can always be performed via a procedure of the type illustrated 
here, or via a convex combination of such procedures. 
First, fiducial measurement X is performed and outcome a' 
is obtained. The outcome a of the complete measurement is 
then a = F1(a'). This result applies to measurements with 
an arbitrary number of outcomes. 

FIG. 7: In GNST, transformations on bipartite systems comprised 
of two gbits can always be represented by the appending 
of classical circuits as shown here, or by a similar construction 
inverted with respect to the two systems, or by a convex 
combination of such. For the construction shown here, this 
means that if fiducial measurements X, Y are performed on 
the transformed system, one may think of this as first performing 
a fiducial measurement X' on one half of the original 
system, where X' = F1(X, Y). This gives an outcome a'. 
Then, perform a fiducial measurement Y' on the other subsystem, 
where Y' = F2(X, Y, a'). This gives an outcome b'. 
The final outcome pair (a, b) is determined by a function F3 
of X, Y, a' and b'. 

FIG. 8: In GNST, measurements on bipartite systems of two 
gbits can always be carried out by a procedure like that illustrated 
here, by a similar procedure inverted with respect 
to the two subsystems, or by a convex combination of such. 
For the procedure shown here, this means that first, a fiducial 
measurement X is performed on one subsystem, and outcome 
a' is obtained. Then fiducial measurement Y' is performed on 
the other subsystem, where Y' = F1(a'), and outcome b' is 
obtained. The outcome a of the complete measurement is 
given by a = F2(a', b'). This result applies to measurements 
with an arbitrary number of outcomes. 



1 and the rest 0. Such an $R_i$ is analogous to an effect in 
quantum theory that is proportional to a 1-dimensional 
projector. A set of $R_i$ is analogous to a non-degenerate 
projective measurement if each $R_i$ is a basis vector (one 
element 1 and the rest 0) and $\sum_i R_i \cdot P = 1 \ \forall P \in S$. The 

corresponding measurement is simply a fiducial measure- 
ment, with an Ri for each outcome. It is then immediate 
that, at least with respect to these measurements, there is 
no Kochen-Specker theorem for single systems in GNST 
or GLT. Not only is it possible to assign definite out- 
comes to these measurements in a non-contextual fash- 
ion, but each such assignment is in fact an allowed state 
of the theory. Nonetheless, both GNST and GLT exhibit 
a different kind of contextuality, introduced by Spekkens 
[43] and termed preparation contextuality. Readers are 
referred to Ref. [43] for discussion of preparation contextuality. 
Given the definition, the proofs for GNST and 
GLT are obvious. 



VII. INFORMATION PROCESSING 

Using the results obtained for dynamics in GNST and 
GLT, the information processing possibilities of each the- 
ory can be investigated. Rather than attempt something 
like a general theory of information, this section contains 
remarks concerning some obvious tasks. Note that there 
has already been some work investigating the information 
processing properties of PR boxes, considered merely as 
abstract correlations. Van Dam has shown that they are 
very powerful for communication complexity problems 
[25], and this result has recently been extended to noisy 
PR boxes in Ref. [26]. Others have claimed to show how 
to do oblivious transfer [19j and bit commitment | 2lil us- 
ing PR boxes. However, as pointed out in Ref. the 
fact that these latter works consider PR boxes only as 
abstract correlations means that they make assumptions 
that may not hold in any theory that allows PR boxes. 
In general, a theory with well defined dynamics is needed 
before cryptography, or indeed other types of informa- 
tion processing, such as computation, can be discussed. 
GNST is such a theory. 

The first results concern teleportation and super-dense 
coding (the quantum versions of these tasks were intro- 
duced in Refs. [44] and [11]). The natural analogue of 
a quantum mechanical singlet in the GNST is a state 
which, when fiducial measurements are performed, pro- 
duces the PR box correlations: 



\[
XY \in \{11, 12, 21\} \;\rightarrow\; P(a=1, b=1|XY) = P(a=2, b=2|XY) = 1/2,
\]
\[
XY = 22 \;\rightarrow\; P(a=1, b=2|XY) = P(a=2, b=1|XY) = 1/2.
\]

It can be shown that these correlations represent a pure 
state - that is a vertex of the polytope of states for two 
gbits. Further, all vertices of this polytope are either 
local deterministic correlations (product pure states) or 
are equivalent to the PR box under local transformations 

m- 
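
The PR box correlations above are easy to check numerically. The following Python sketch is illustrative only (the function names are not from the original text): it encodes P(a, b|XY) with labels in {1, 2}, verifies normalization and the no-signalling conditions, and evaluates a CHSH-type expression, assuming the standard definition of the correlator as P(a = b) minus P(a ≠ b).

```python
from itertools import product

VALUES = (1, 2)  # labels for both measurement choices and outcomes


def pr_box(a, b, x, y):
    """P(a, b | X, Y): outcomes agree unless X = Y = 2; each allowed pair has weight 1/2."""
    outcomes_agree = (a == b)
    should_agree = (x, y) != (2, 2)
    return 0.5 if outcomes_agree == should_agree else 0.0


def is_normalized(p):
    """Probabilities sum to 1 for every pair of fiducial measurements."""
    return all(
        abs(sum(p(a, b, x, y) for a, b in product(VALUES, repeat=2)) - 1.0) < 1e-12
        for x, y in product(VALUES, repeat=2)
    )


def is_non_signalling(p):
    for fixed, outcome in product(VALUES, repeat=2):
        # Alice's marginal P(a | X) must not depend on Bob's choice Y ...
        alice = [sum(p(outcome, b, fixed, y) for b in VALUES) for y in VALUES]
        # ... and Bob's marginal P(b | Y) must not depend on Alice's choice X.
        bob = [sum(p(a, outcome, x, fixed) for a in VALUES) for x in VALUES]
        if abs(alice[0] - alice[1]) > 1e-12 or abs(bob[0] - bob[1]) > 1e-12:
            return False
    return True


def chsh_value(p):
    """E(1,1) + E(1,2) + E(2,1) - E(2,2), with E(x,y) = P(a = b) - P(a != b)."""
    def correlator(x, y):
        return sum((1 if a == b else -1) * p(a, b, x, y)
                   for a, b in product(VALUES, repeat=2))
    return correlator(1, 1) + correlator(1, 2) + correlator(2, 1) - correlator(2, 2)
```

Calling `chsh_value(pr_box)` gives the algebraic maximum of 4, compared with the bound of 2 for local correlations and 2√2 for quantum correlations.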

Theorem 11 It is impossible to teleport an unknown 
gbit using a single shared PR box. 

Proof. This follows easily from Theorem 10. In order 
to teleport an unknown gbit, Alice must perform some 
operation or sequence of operations on the gbit and her 
half of the shared PR box. Without loss of generality, 
whatever she does may be represented as a single joint 
measurement, with m outcomes, on the two subsystems. 
But this measurement can be represented as a convex 
combination of procedures like that of Fig. 8. Such a 
procedure will always begin, either by measuring X = 1 
or X = 2 on the gbit, or by measuring the PR box. In 
the former case, no information is gained about the value 



One such assumption that as far as I know has not been pointed 
out is that the shared boxes are trusted to behave like PR boxes 
by both parties. But one may reasonably ask where did they 
come from? By whom were they distributed? 



for the other measurement on the gbit and teleportation 
cannot possibly succeed on all pure states. In the latter 
case, the shared PR box collapses into a product state 
which cannot achieve teleportation. □ 

Theorem 12 A single shared PR box cannot be used for 
super-dense coding. 

Proof. This follows from Theorems 7 and 10. Super-dense 
coding would require that there are four different 
operations that Alice can perform on her gbit such that, 
when it is sent to Bob, he can determine unambiguously 
which was performed by a joint measurement on the two 
gbits now in his possession. It is easy to see that this is 
not possible. □ 



A. Cryptography 

Theorem 13 In GNST, key distribution is possible. 

Proof. Key distribution can be achieved in GNST using 
an Ekert-style protocol [il] , in which Alice and Bob first 
share n pairs of gbits, with each pair in the PR box state. 
They then test some of their shared systems, to make 
sure that they really are PR box states, i.e., that they 
have not been disturbed en route by an eavesdropper. 
Finally, they measure each remaining gbit pair, using the 
fiducial measurements X = 1 and Y = 1. Assuming 
that they share perfect PR box states, their measurement 
outcomes will be perfectly correlated and can be used as 
a secret key. This protocol is secure because PR box 
states have a property of being monogamous, much as 
the entanglement of a singlet is monogamous in quantum 
theory. Thus consider a tripartite system shared between 
Alice, Bob and Eve. If Alice's and Bob's reduced state is 
the PR box state $P^{AB}_{PR}$, then the global state must be of 
the form $P^{AB}_{PR} \otimes P^{E}$. The outcome of any measurement 
performed by Eve is uncorrelated with Alice's and Bob's 
outcomes. The fact that the PR box correlations are 
monogamous was shown in Ref. jl^ . □ 
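
The steps of the protocol can be sketched as follows. This is an idealized simulation under stated assumptions, not the author's specification: the PR box is sampled centrally by the hypothetical helper `sample_pr_box`, the test phase (left unspecified in the text) is assumed to use randomly chosen fiducial measurements, and no eavesdropper or noise is modelled.

```python
import random


def sample_pr_box(x, y, rng):
    """Sample (a, b) in {1, 2}: outcomes agree unless X = Y = 2."""
    a = rng.choice((1, 2))
    b = a if (x, y) != (2, 2) else 3 - a
    return a, b


def ekert_style_key(n_pairs, n_test, rng):
    """Test n_test pairs with random settings, then measure the rest with X = Y = 1."""
    for _ in range(n_test):
        x, y = rng.choice((1, 2)), rng.choice((1, 2))
        a, b = sample_pr_box(x, y, rng)
        # Assumed test: abort if a pair deviates from the PR box correlations.
        if (a == b) != ((x, y) != (2, 2)):
            raise RuntimeError("test pair deviates from PR box correlations")
    alice_key, bob_key = [], []
    for _ in range(n_pairs - n_test):
        a, b = sample_pr_box(1, 1, rng)
        alice_key.append(a - 1)  # relabel outcomes {1, 2} as key bits {0, 1}
        bob_key.append(b - 1)
    return alice_key, bob_key
```

With perfect PR boxes the two returned keys always coincide, which is the perfect correlation the proof relies on.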

Recall Theorem 5, which implies that except for classical 
systems, there are pure states (lying in the same 
subspace Vi), which cannot be distinguished by non-disturbing 
operations. This motivates 

Conjecture 1 In any non-classical theory, secure key 
distribution is possible. 

Finally, 

Theorem 14 1-2 oblivious transfer can be implemented 
securely in both GNST and GLT. 

Proof. In a 1-2 oblivious transfer (introduced in 
Ref. [11]), Alice must submit 2 bits to Bob in such a 
manner that Bob can choose to learn either one of the 
bits or the other, but not both. There is also a security 
requirement against Alice, who must not be able to learn 
which of the bits Bob chose. That this task is impossible 
to implement securely in quantum theory is shown in 
Ref. [iil. To implement this task in GNST or GLT, Alice 
sends a gbit to Bob, in a pure state, with the two bits 
encoded in the outcomes for the X = 1 measurement and 
the X = 2 measurement. Theorem 8 ensures that any 
strategy employed by Bob is equivalent to his measuring 
either X = 1 or X = 2, or to measuring X = 1 with some 
probability p and X = 2 with probability 1 - p. Thus the 
protocol is secure against Bob. That it is secure against 
Alice follows from the fact that, by the no-signalling prin- 
ciple, she cannot determine which measurement Bob did. 
□ 
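
A toy model of this protocol is given below. It is a sketch under a modelling assumption, namely that the content of Theorem 8 is captured by letting a gbit be measured only once; the class `Gbit` and the function `oblivious_transfer` are hypothetical names introduced for illustration.

```python
class Gbit:
    """A pure gbit state: a definite outcome for each fiducial measurement."""

    def __init__(self, outcome_x1, outcome_x2):
        self._outcomes = {1: outcome_x1, 2: outcome_x2}
        self._used = False

    def measure(self, x):
        # Modelling assumption: after one fiducial measurement nothing more
        # can be learned, so a second measurement is simply refused.
        if self._used:
            raise RuntimeError("gbit has already been measured")
        self._used = True
        return self._outcomes[x]


def oblivious_transfer(bits, choice):
    """Alice encodes two bits in one gbit; Bob measures X = choice to learn one of them."""
    gbit = Gbit(bits[0], bits[1])  # Alice's preparation
    return gbit.measure(choice)    # Bob's single measurement
```

Security against Alice is not simulated here; in the text it follows from the no-signalling principle.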

In classical cryptography, it is known that 1-2 oblivious 
transfer is equivalent to oblivious transfer [50], and 
that either can be used to implement arbitrary secure distributed 
computation [51]. In particular, either can be 
used to implement bit commitment, hence coin tossing. 
However, one cannot assume that the standard reduc- 
tions of classical cryptography hold in a different the- 
ory such as GLT, GNST or quantum theory. Thus it is 
open whether other two-party cryptographic tasks, such 
as oblivious transfer, bit commitment or coin tossing, can 
be implemented securely in GNST or GLT. 



B. Computation 

For any of the theories in the framework, a natural 
model of computation may be defined, based on the classical 
and quantum circuit models. I introduce this model 
only informally. A particular circuit is assumed to act on 
n systems, each of the same type, initially prepared in 
a product state corresponding to the problem input. Instead 
of k-bit or k-qubit gates, there are transformations 
that act jointly on k systems. At the end of the computation, 
the fiducial measurement X = 1 is performed on 
each system in order to obtain the output. For a particular 
theory, it may not be the case that bipartite and 
single system transformations together are universal, as 
they are in classical and quantum theory. Thus transformations 
that act jointly on k systems for any k > 2 are 
allowed. But for any circuit family $C_n$, there must exist 
some finite k such that all transformations act on k systems 
or fewer. In addition, it may not be the case that 
any particular type of system (such as a gbit) is universal 
for computation in a given theory. So one should keep 
in mind that circuits may act on other types of system. 
Finally, in order to define a notion of polynomial time, 
say, the usual caveats must be assumed. For example, it 
should be possible for a classical Turing machine to output 
a description of the ith circuit in time polynomial in 
i. 

Theorem 15 In GLT, any computation can be simulated 
efficiently by a probabilistic classical computer. 

Proof. In GLT, any allowed state of n systems can be 
written as a convex combination of local deterministic, 
or pure product, states, in which each system has a definite 
outcome for each fiducial measurement. A classical 
simulation of the GLT computation works by storing, 
at any given time, a local deterministic state of the n 
systems. This requires an amount of memory linear in 
n, rather than the exponential amount needed to store 
a complete description of an arbitrary convex combination. 
An allowed transformation T, acting on k systems, 
must take local deterministic states of the k systems to 
other allowed states of GLT, which in turn are convex 
combinations of local deterministic states: 

\[
T(P^{LD}) = \sum_i p_i P_i^{LD},
\]

where superscript LD indicates a local deterministic 
state. The classical computer simulating the GLT computation 
simply updates the stored state $P^{LD} \rightarrow P_i^{LD}$ 
with probability $p_i$. When the final X = 1 measurements 
are performed, the stored local deterministic state 
will determine the classical computer's output. □ 

The computational power of GNST is at present unclear. 
But it is known that it is very powerful for communication 
complexity problems. 
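
The simulation argument in the proof of Theorem 15 can be sketched in Python. This is a minimal illustration under stated assumptions: a local deterministic state is stored as one pair (outcome for X = 1, outcome for X = 2) per system, and each gate is supplied as a hypothetical function returning the convex decomposition of its output into local deterministic states. The example gates are simple relabellings chosen for illustration; the sketch does not itself check that a supplied gate is an allowed GLT transformation.

```python
import random


def simulate_glt_circuit(initial_state, gates, rng):
    """Classically simulate a circuit by tracking a single local deterministic state.

    initial_state: list of (outcome for X=1, outcome for X=2), one pair per system.
    gates: list of (targets, transform); transform maps the tuple of local states
           of the target systems to a list of (probability, new local states).
    """
    state = list(initial_state)
    for targets, transform in gates:
        local = tuple(state[t] for t in targets)
        branches = transform(local)
        weights = [p for p, _ in branches]
        # Update the stored state to branch i with probability p_i.
        new_local = rng.choices([s for _, s in branches], weights=weights)[0]
        for t, s in zip(targets, new_local):
            state[t] = s
    # Final fiducial measurement X = 1 on every system.
    return [outcome_x1 for outcome_x1, _ in state]


def swap_measurements(local):
    """Single-system relabelling that exchanges the roles of X = 1 and X = 2."""
    ((o1, o2),) = local
    return [(1.0, ((o2, o1),))]


def swap_systems(local):
    """Two-system gate that exchanges the states of the two systems."""
    first, second = local
    return [(1.0, (second, first))]
```

The memory used is one pair of outcomes per system, in line with the linear memory requirement stated in the proof.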



Theorem 16 In GNST, bipartite communication com- 
plexity problems require only constant communication, 
provided the parties share sufficient PR boxes. 

Recall that in a bipartite communication complexity sce- 
nario, two separated parties each receive an input, and 
their task is to compute some joint function of their inputs. 
Their goal is to minimize the amount of communication. 
Van Dam has shown that if the two parties 
have a supply of shared PR boxes, then any communication 
complexity problem can be solved with only constant 
communication [25]. This result has recently been 
strengthened: it continues to hold even if the shared PR 
boxes are noisy, provided the amount of noise is not too 
great [26]. Contrast the situation in quantum theory, 
where the inner product problem is known to require n 
bits of communication to be solved exactly, even with 
unlimited shared singlets (s^ . 
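
As an illustration of how shared PR boxes reduce communication, the following Python sketch treats only the inner product function, not the general result. It assumes outcomes and measurement choices are relabelled to {0, 1} (with choice 2 mapped to 1), so that the PR box correlations read a XOR b = x AND y; the helper names are hypothetical, and the boxes are sampled centrally for simplicity.

```python
import random


def pr_box(x, y, rng):
    """x, y in {0, 1}; returns (a, b) with a XOR b = x AND y and a uniformly random."""
    a = rng.randrange(2)
    return a, a ^ (x & y)


def inner_product_with_pr_boxes(xs, ys, rng):
    """Compute the inner product mod 2 of xs and ys using one PR box per bit position.

    Returns (message, result): the single bit Alice sends, and Bob's answer.
    """
    alice_parity, bob_parity = 0, 0
    for x, y in zip(xs, ys):
        a, b = pr_box(x, y, rng)  # Alice inputs x and sees a; Bob inputs y and sees b
        alice_parity ^= a
        bob_parity ^= b
    message = alice_parity          # the only communication: one bit
    result = message ^ bob_parity   # equals the XOR over i of (x_i AND y_i)
    return message, result
```

Whatever the input length, Alice sends exactly one bit, in contrast with the n bits needed in the quantum case mentioned above.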
Finally, 

Theorem 17 Super-quantum memory. In GNST, it is 
possible to store a $2^n$-bit string in only n gbits. Although 
the whole string cannot be recovered, it is possible to recover 
the ith bit without error. 

Proof. Suppose that the ith bit of the $2^n$-bit string 
we wish to store is given by $f(i_1, \ldots, i_n) \in \{0,1\}$, 
where $i_1 \ldots i_n$ is the binary representation of i. Let 
$X_1, \ldots, X_n \in \{0,1\}$ be fiducial measurements on the n 
gbits and $a_1, \ldots, a_n \in \{0,1\}$ the outcomes. (It is easier 
for this proof to let $X_j$ and $a_j$ take values in {0, 1} 
instead of in {1, 2} as elsewhere.) To store the string, 
prepare a state of n gbits such that 

\[
P(a_1, \ldots, a_n | X_1, \ldots, X_n) =
\begin{cases}
1/2^{n-1} & \text{if } a_1 \oplus \cdots \oplus a_n = f(X_1, \ldots, X_n) \\
0 & \text{otherwise}
\end{cases}
\qquad (24)
\]

where $\oplus$ represents addition mod 2. In order to recover 
the ith bit of the stored string, simply perform the measurement 
$X_j = i_j$ on each gbit and sum the outcomes 
mod 2. One may check that the state of Eq. (24) is an 
allowed state, since it is normalized and non-signalling. 
Note that it is indeed impossible to store a $2^n$-bit string 
in only n qubits such that any bit may be recovered. 
Bounds on quantum memory are derived in Ref. [53]. □ 
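
The checks mentioned in the proof can be carried out by brute force for small n. The Python sketch below is illustrative, with hypothetical function names: it builds the state of Eq. (24) from a function f, verifies normalization, verifies that summing out any one gbit's outcome leaves no dependence on that gbit's measurement choice, and recovers a stored bit from the parity of the outcomes.

```python
from itertools import product


def memory_state(f, n):
    """Return P(a_1..a_n | X_1..X_n) of Eq. (24) as a function of two bit tuples."""
    def prob(outcomes, settings):
        return 1.0 / 2 ** (n - 1) if sum(outcomes) % 2 == f(settings) else 0.0
    return prob


def is_normalized(prob, n):
    bits = list(product((0, 1), repeat=n))
    return all(abs(sum(prob(a, x) for a in bits) - 1.0) < 1e-12 for x in bits)


def is_non_signalling(prob, n):
    """Summing out gbit k's outcome must remove any dependence on gbit k's setting."""
    for k in range(n):
        for x in product((0, 1), repeat=n):
            x_flipped = x[:k] + (1 - x[k],) + x[k + 1:]
            for rest in product((0, 1), repeat=n - 1):
                m = sum(prob(rest[:k] + (a,) + rest[k:], x) for a in (0, 1))
                m_flipped = sum(prob(rest[:k] + (a,) + rest[k:], x_flipped) for a in (0, 1))
                if abs(m - m_flipped) > 1e-12:
                    return False
    return True


def recover_bit(prob, index_bits):
    """Measure X_j = i_j on each gbit; every possible outcome string has the same parity."""
    n = len(index_bits)
    parities = {sum(a) % 2 for a in product((0, 1), repeat=n) if prob(a, index_bits) > 0}
    assert len(parities) == 1
    return parities.pop()
```

For example, with n = 3 and f reading off the bits of an 8-bit string, `recover_bit` returns the addressed bit for every index, while only three gbits are used.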

VIII. DISCUSSION 

A. The framework 

The framework introduced allows investigation of the- 
ories different from either quantum or classical theories. 
The general idea is that quantum theory can be better 
understood by viewing it in a context of different possi- 
bilities. More specific motivations include: 

1. to understand the links between general physical 
principles and information processing; 

2. to stimulate the study of computation in models 
that are more general than quantum theory; 

3. to address Popescu's and Rohrlich's question of 
why quantum theory does not allow the PR box 
correlations; 

4. to shed light on the interpretive problems of quan- 
tum theory by viewing those in a more general con- 
text; 

5. to stimulate research into axioms for quantum the- 
ory. 

As regards single systems the framework is very gen- 
eral indeed. It should be emphasized in particular that 
linearity of transformations is not assumed, but is de- 
rived from the fact that the vector P is by definition 
a complete description of the system. The most important 
requirements are that local operations commute 



So what of nonlinear modifications of quantum mechanics? 
These modifications are nonlinear in the sense that they involve 
a nonlinear Schrödinger equation. In this case, the usual density 
matrix is no longer a complete description of a quantum system, 
since the evolution of a system will in general depend not only 
on the density matrix, but on the particular decomposition into 
pure states (assuming a proper mixture). If the description of 
the state is expanded until it is complete, then the action of the 



(Assumption 4), and the global state assumption (Assumption 5), 
both involving the manner in which separate 
systems combine to make joint systems. These imply 
a tensor product rule. 

One of the interesting things to emerge from the frame- 
work is that certain features, usually thought of as specif- 
ically quantum, are possessed by all theories except clas- 
sical theories. These include the non-unique decompo- 
sition of mixed states into pure states, the existence of 
sets of pure states that cannot be distinguished with non- 
disturbing operations, and the impossibility of even prob- 
abilistic universal cloning. Thus rather than regard quan- 
tum theory as special for having these features, a better 
attitude may be to regard classical theories as special for 
not having them. 

How reasonable are Assumptions 4 and 5? Commutativity 
of local operations is arguably part of what it 
means to talk about separate systems. In a theory where 
it fails, any measurement or transformation is essentially 
a measurement or transformation on all systems at once. 
It is no longer obvious how to define a reasonable model 
of computation - how should resources be counted? The 
case for assuming the commutativity of local operations 
is also strengthened by the fact that in a spacetime frame- 
work, it can be independently motivated by special rel- 
ativity. It is slightly more difficult to regard the global 
state assumption as independently compelling. Thus an 
interesting direction in which to extend this work would 
be to generalize the framework further by dropping this 
assumption. 



B. The tensor product rule 

It is interesting to compare the derivation of the tensor 
product rule with that of Fuchs [l^ . Without going into 
too much detail, Fuchs assumes that local measurements 
on two separate systems, A and B, are represented by 
positive operator-valued measures on Hilbert spaces $H_A$ 
and $H_B$. He derives a Gleason-like theorem [s^ [s^ [s^l 
which states that the joint state of the two systems can be 
represented by an operator on the tensor product Hilbert 
space $H_A \otimes H_B$, with joint probabilities for outcomes of 
local measurements given by the standard trace rule. 

As Fuchs acknowledges, the proof does not establish 
that the operator describing the joint state has to be pos- 
itive, but only that it has to be positive with respect to 
local measurements. A consistent theory that is not ruled 
out would allow the state to be negative with respect to 
some joint measurements (the Bell basis measurement, 
for example), but would not allow such measurements. 



dynamics on this new expanded state description will be linear. 
But such a theory will in general violate one or more of the other 
assumptions. A list of references on nonlinear quantum theories 
is given in Ref. [54| . and computation in this context is consid- 
ered in Ref. Q. 
Furthermore, the assumption that local operations commute 
and the global state assumption are both implicit 
in Fuchs' analysis. Without the latter, the possibility 
remains that there are extra degrees of freedom, not ac- 
cessible via local measurements, that are not described 
by an operator on the tensor product Hilbert space. 

It follows that Fuchs' conclusion is not stronger than 
the tensor product rule derived in this paper. The latter 
may be regarded as a generalization of Fuchs' proof to the 
case in which the subsystems A and B are not necessarily 
quantum. 



C. Information theory, GNST and GLT 

In addition to describing general properties of the 
framework, I investigated in detail two particular the- 
ories, GNST and GLT. I focussed on the information 
processing possibilities in these theories. One of the 
most interesting things to have emerged is that there is 
a trade-off between the states of a theory and the al- 
lowed dynamics. This arises for the simple reason that 
an allowed transformation must take allowed states into 
allowed states. Thus the dynamics of both GNST and 
GLT is very simple for single systems. In GNST, a simi- 
lar result holds for the simplest kind of bipartite system. 
The surprising consequence is that GNST is less power- 
ful than quantum theory in many ways, despite including 
super-quantum correlations. For example, teleportation 
and super-dense coding are impossible. It is already clear 
that computation in GLT can be simulated efficiently 
classically, while the computational power of GNST re- 
mains open. Another open question is whether secure bit 
commitment is possible in either theory. Despite these 
remarks, it is surprising how many features of quantum 
theory have analogues in GNST. These obviously include 
the generic features demonstrated in Section IV, along 
with entanglement and nonlocality. But they also in- 
clude things I have not discussed in detail, such as the 
distinction between sharp and unsharp measurements, 
and preparation contextuality. (Other authors have also 
found features of quantum theory reproduced in other 
contexts. Masanes et al. j3l show that various features, 
including a no-cloning theorem, are present in all theo- 
ries that are nonlocal and non-signalling. Spekkens has 
introduced a toy theory that contains a remarkably wide 
range of quantum phenomena [Tlj . although note that 
this theory is not contained in our framework as it does 
not allow arbitrary convex combinations of states.) 

As mentioned above, one of the motivations of this 
work is to stimulate the study of computation in mod- 
els that are more general than quantum theory. Some 
authors have already considered computation in non- 
standard theories. However, these theories are often 
modifications of quantum theory that appear to have 
both unphysical consequences and immense computa- 
tional power. It is suspected that quantum theory with 
a nonlinear Schrödinger equation is very powerful, enabling 
the solution of NP-complete problems in polynomial 
time, for example. Aaronson has considered various 
modifications of quantum theory, including a model 
that assumes the ability to postselect measurement out- 
comes, and a hidden variable model in which the history 
of hidden states can be read out by the observer [sl, [^. 
Various authors have considered classical and quantum 
computation in the presence of closed timelike curves 
0, Q . Most recently, Aaronson and Watrous have shown 
that BQP with closed timelike curves is equivalent to 
PSPACE 0. The framework introduced in this paper is 
the natural place to investigate computation in theories 
that are different from quantum theory, yet not obviously 
physically unreasonable or immensely powerful. I suggest 
that NP-complete problems cannot be solved efficiently 
by any theory in the framework. I also raise the following 

Conjecture 2 A quantum computer can simulate com- 
putation in any other theory in the framework with at 
most polynomial overhead. 

The intuition behind this is that quantum theory achieves 
in some sense an optimal balance of allowed states and 
dynamics. 



D. Interpretation 

On the face of it, many theories that can be written 
down in the present framework have similar interpretive 
issues as quantum theory, if one tries to understand them 
in a way that goes beyond the purely operational. Con- 
sider a universe in which some theory other than quan- 
tum or classical (GNST perhaps) is verified in laboratory 
experiments. The denizens of such a universe would be 
having debates in many ways similar to the debates that 
surround quantum theory. Is a pure state better under- 
stood as a complete description of individual reality, as 
representing an ensemble, or as representing the degrees 
of belief of some agent? 

Suppose that the inhabitants of this universe attempt 
to extend the theory to include a description of the mea- 
suring apparatus, and of the interaction between system 
and apparatus. This is always possible in classical and 
quantum theory. In quantum theory, this fact is ex- 
pressed in the idea that the Heisenberg cut can be moved 
upwards indefinitely. Are classical and quantum theories 
special in this regard, or can this be done in any theory? 

Even when the inhabitants succeed in constructing a 
measurement theory along these lines, it is plausible that 
many theories will have a measurement problem. In these 



In Ref . Q , it is claimed that nonlinear quantum theory can solve 
NP-complete and even #P-complete problems efficiently. Aaron- 
son complains Q that in this particular case it is difficult to 
evaluate whether exponential precision is required. 
theories, the system and apparatus are typically in some 
entangled state after interaction. Some inhabitants may 
suggest hidden variables or some kind of collapse dynam- 
ics. Does any theory admit an Everettian interpretation, 
or is there a special feature of quantum theory that is 
necessary for this to work? 

I won't discuss these issues any further. I have raised 
them hoping that considering interpretive issues in a 
framework more general than quantum theory might give 
a new lease of life to the quantum debates. 

E. Axioms 

Aside from Hardy's derivation [13], what different ways 
are there of uniquely identifying quantum theory from 
the other theories in the framework by adding as few ex- 
tra assumptions as possible? Several have pushed the 
idea that a quantum state is best understood as a sum- 
mary of an agent's degrees of belief about the outcomes 
of future measurements on a system (32l . Issl . [59| . From 
this standpoint, Fuchs has argued that the formalism of 
quantum theory should be understood as a constraint 
on these degrees of belief, hopefully to be derived via a 
small number of postulates, along with an argument that 
any rational agent must accept [isj . Spekkens has also 
argued for an epistemic constraint as a foundational prin- 
ciple for quantum theory, although for Spekkens, beliefs 

Whether they succeed in deriving the full structure of quantum 
theory is debatable. But they do establish the existence of non- 



are about underlying ontic states of a system rather than 
future measurement outcomes (T]| . 

Clifton, Bub and Halvorson (CBH) have taken a dif- 
ferent approach and derived at least part of quantum 
theory from the assumption of (i) a no-signalling prin- 
ciple, (ii) a no-broadcasting principle, and (iii) the im- 
possibility of secure bit commitment [1^.^^ CBH as- 
sume a C*-algebraic framework, which is broad enough 
to include classical theories and quantum theory, but is 
not as broad as the framework presented here. An open 
question is whether something like CBH's proof would 
go through in the broader framework, or whether there 
is some theory (GNST perhaps) that satisfies (i)-(iii) and 
is clearly not quantum. 


APPENDIX A: PROOF OF LINEARITY OF 
TRANSFORMATIONS 

The proof in this appendix is adapted from that of 
Hardy in Ref. [13]. It is included to keep this work self-contained. 

A transformation is a map f from allowed states of a 
system to allowed states. The map satisfies Eq. (7), reproduced 
here: 

\[
f\Big(\sum_i q_i P_i\Big) = \sum_i q_i f(P_i) \quad \forall P_i \in S, \ \text{for } 0 \le q_i \le 1, \ \sum_i q_i = 1.
\qquad (A1)
\]

The map should also satisfy 

\[
f(0) = 0.
\]

(This follows from the interpretation of unnormalized 
states. Recall that if a particular outcome i of some operation 
occurs with probability q < 1, then we associate 
with that outcome an unnormalized vector P. Each entry 
of P gives the joint probability of obtaining outcome 
i for the original operation, and outcome j for a fiducial 
measurement performed immediately afterwards. Thus 
if q = 0, it follows that the associated P = 0. By definition, 
an entry in the vector f(0) represents the joint 
probability of getting the following outcomes in sequence: 
outcome i for the original operation, then whatever outcome 
it is that corresponds to the transformation f, and 
then outcome j for a fiducial measurement. But these 
probabilities must all be zero if the probability of outcome 
i is zero.) 

Writing the first of the above equations with i = 1, 2, 
and setting $P_2 = 0$, gives 

\[
f(qP) = q f(P) \quad \forall P \in S, \ \text{for } 0 \le q \le 1.
\]

Suppose that P is a pure state $\in S$. Pure states are by 
definition normalized. If r > 1, then f(rP) is initially 
undefined because $rP \notin S$, so we are free to stipulate 
that 

\[
f(rP) = r f(P) \quad \forall P \in S, \ r \ge 0.
\]

Define $S_+$ as the set of all vectors that can be written in 
the form rP with $P \in S$ and $r \ge 0$. It is a convex cone 
[Slf . Eq. (A1) can be extended slightly: 

\[
f\Big(\sum_i r_i P_i\Big) = \sum_i r_i f(P_i) \quad \forall P_i \in S_+, \ \text{for } r_i \ge 0.
\qquad (A2)
\]

Now suppose that 

\[
P = \sum_i s_i P_i,
\qquad (A3)
\]

where $P, P_i \in S_+$, and the $s_i$ are real. Let $i \in A_-$ if 
$s_i < 0$ and $i \in A_+$ if $s_i \ge 0$. Rewrite Eq. (A3) as 

\[
P + \sum_{i \in A_-} |s_i| P_i = \sum_{i \in A_+} s_i P_i.
\]

Each side is a conic combination of vectors in $S_+$, thus 
Eq. (A2) applies, and rearranging we get 

\[
f(P) = \sum_i s_i f(P_i).
\]

Finally, for any vector $Q \notin S_+$, f(Q) can be defined 
uniquely by linear extension if Q lies in the subspace 
spanned by S. The action of f on the rest of the vector 
space is arbitrary but may be defined to be linear. □ 



APPENDIX B: DERIVATION OF TENSOR 
PRODUCT RULE 

As discussed in the main text, the state of a joint sys- 
tem AB can be written 



\[
P^{AB} =
\begin{pmatrix}
P(a=1, b=1 | X=1, Y=1) \\
P(a=1, b=2 | X=1, Y=1) \\
\vdots
\end{pmatrix}
\]



Proof of Theorem 1. This theorem is trivial. Let 
$P^{AB} \in V^{AB}$, $P^A \in V^A$ and $P^B \in V^B$. Define the vector 
$Q^{AB}_{ij,kl}$ as the vector with a 1 for the entry corresponding to 
the joint outcome ij of the joint fiducial measurement kl, 
and 0s elsewhere. Similarly $Q^A_{ik}$ and $Q^B_{jl}$. Now identify 
$Q^{AB}_{ij,kl}$ with $Q^A_{ik} \otimes Q^B_{jl}$ and extend linearly. 



□ 
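
The identification of joint basis vectors with tensor products of single-system basis vectors can be made concrete with a short Python sketch. The ordering of vector entries used here (measurement first, then outcome, with the Kronecker ordering for the joint vector) is a convention assumed for the sketch and need not match the ordering used in the main text; the helper names are hypothetical.

```python
def kron(u, v):
    """Kronecker (tensor) product of two vectors given as lists."""
    return [ui * vj for ui in u for vj in v]


def index(outcome, measurement, n_outcomes=2):
    """Position of P(outcome | measurement) in a single-system state vector."""
    return (measurement - 1) * n_outcomes + (outcome - 1)


def basis_vector(outcome, measurement, n_outcomes=2, n_measurements=2):
    """Vector with a 1 in the entry for the given outcome and measurement, 0s elsewhere."""
    vec = [0] * (n_outcomes * n_measurements)
    vec[index(outcome, measurement, n_outcomes)] = 1
    return vec


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


# Example gbit states, entries (P(1|X=1), P(2|X=1), P(1|X=2), P(2|X=2)).
p_a = [0.5, 0.5, 1.0, 0.0]
p_b = [1.0, 0.0, 0.25, 0.75]
joint = kron(p_a, p_b)

# The product of basis vectors picks out the joint probability
# P(a = 2, b = 1 | X = 1, Y = 2) = P(a = 2 | X = 1) * P(b = 1 | Y = 2).
q_joint = kron(basis_vector(2, 1), basis_vector(1, 2))
picked = dot(q_joint, joint)
```

For a product state the selected entry factorizes, as expected; for a general joint state the same basis vectors simply read off the joint probabilities.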



Proof of Theorem 2. Consider a joint system AB. For each of the fiducial measurements that define the state of system B, there must be at least one operation on the joint system AB that corresponds to performing that measurement. Let this operation for the $j$th fiducial measurement be characterized by the set of matrices $\{M_{ij}\}$, where there is a value of $i$ for each outcome and $j$ is fixed. When the transformation $M_{ij}$ acts on AB, the resulting state is the unnormalized state $P^{AB}_{ij} \in S^{AB}$. The corresponding reduced state for A is the unnormalized state $P^A_{ij}$. By Constraint 2, $P^A_{ij} \in S^A$. If a fiducial measurement is now performed on A, the state $P^A_{ij}$ gives the (unnormalized) probabilities for the different outcomes. It follows that $P^{AB}$ can be written in the form

\[
P^{AB} = \sum_{ij} P^A_{ij} \otimes Q^B_{ij}, \tag{B1}
\]

with $P^A_{ij} \in S^A$ and $Q^B_{ij}$ as above. Now consider a vector $U \otimes W \in V^{AB}$, with $W \in V^B$ but $U \perp S^A$, where this means that $U$ is orthogonal to all vectors in $S^A$. From Eq. (B1) it follows that $(U \otimes W).P^{AB} = 0$. A similar result holds if $W \perp S^B$ and $U \in V^A$. Thus $P^{AB}$ lies in the subspace of $V^{AB}$ that is spanned by vectors from $S^A \otimes S^B$. The expansion $P^{AB} = \sum_{ij} r_{ij}\, P^A_i \otimes P^B_j$ asserted in the theorem follows. The vectors on the right hand side of this equation can be assumed normalized, since any multiplying factor can be subsumed into the corresponding $r_{ij}$. They can be assumed pure, since a mixed state can always be expressed as a convex combination of pure states and 0. But any term with 0 will not contribute. Theorem 2 follows. □

Proof of Theorem 3. Consider a joint system AB and a transformation $T^A$ of system A alone. $T^A$ corresponds to a matrix $M^A$ such that $P^A \to P'^A = M^A.P^A$. The aim is to determine the effect of this transformation on the joint state $P^{AB}$. From Section II A this will correspond to a matrix $M^{AB}$ such that $P^{AB} \to P'^{AB} = M^{AB}.P^{AB}$. But what is the relation between $M^{AB}$ and $M^A$?

Consider the following procedure. First, the transformation $T^A$ is applied. Then fiducial measurements are performed on systems A and B. The (unnormalized) joint probabilities for the outcomes of these measurements are then the entries of the vector $P'^{AB}$. However, by Assumption 1, the ordering of operations on systems A and B does not matter. Thus the following procedure is equivalent. First, a fiducial measurement is performed on system B. Note that the reduced state of system A conditioned on a particular outcome for this measurement is defined by the vector $P^A_{jl}$. Next, the transformation $T^A$ is performed on system A. Finally, a fiducial measurement is performed on system A.

In the second procedure, we know how to apply the transformation $T^A$, since it is enough to consider its action on system A alone, and we know that $P^A \to P'^A = M^A.P^A$. We obtain

\[
P'^{AB}_{ij,kl} = \sum_{i'k'} (M^A)_{ik,i'k'}\, P^{AB}_{i'j,k'l}.
\]

But, by the definition of $M^{AB}$,

\[
P'^{AB}_{ij,kl} = \sum_{i'k'} \sum_{j'l'} (M^{AB})_{ij,kl;\,i'j',k'l'}\, P^{AB}_{i'j',k'l'},
\]

thus

\[
\sum_{i'k'} \sum_{j'l'} (M^{AB})_{ij,kl;\,i'j',k'l'}\, P^{AB}_{i'j',k'l'} = \sum_{i'k'} \sum_{j'l'} (M^A)_{ik,i'k'}\, \delta_{jj'}\, \delta_{ll'}\, P^{AB}_{i'j',k'l'}.
\]

This holds for all $P^{AB} \in S^{AB}$, and the action of $M^{AB}$ on vectors $P^{AB} \notin S^{AB}$ is arbitrary. It follows that we lose no generality in identifying

\[
M^{AB} = M^A \otimes I,
\]

where $I$ is the identity on $V^B$. □
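The tensor product rule for local transformations can be checked numerically on a small case. The following is a minimal sketch, not taken from the paper: it assumes two (2,2) systems, a joint state that is an equal mixture of two product states, and a hypothetical local transformation `MA` that swaps the two fiducial measurements on A. All names (`kron_mat`, `kron_vec`, `mat_vec`, `mix`, `MA`, `IB`) are illustrative assumptions.

```python
from fractions import Fraction as F

def kron_mat(A, B):
    """Kronecker product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def kron_vec(u, v):
    """Tensor product of two vectors, with the first factor as the major index."""
    return [x * y for x in u for y in v]

def mat_vec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mix(u, v):
    """Equal-weight convex combination of two vectors."""
    return [F(1, 2) * x + F(1, 2) * y for x, y in zip(u, v)]

# Single-system states, ordered as
# (P(a=1|X=1), P(a=2|X=1), P(a=1|X=2), P(a=2|X=2)).
PA1 = [F(1), F(0), F(1, 2), F(1, 2)]
PA2 = [F(0), F(1), F(1), F(0)]
PB1 = [F(1), F(0), F(0), F(1)]
PB2 = [F(1, 2), F(1, 2), F(1), F(0)]

# Hypothetical local transformation on A: exchange the X=1 and X=2 blocks.
MA = [[0, 0, 1, 0],
      [0, 0, 0, 1],
      [1, 0, 0, 0],
      [0, 1, 0, 0]]
IB = [[int(r == c) for c in range(4)] for r in range(4)]  # identity on V^B

P_AB = mix(kron_vec(PA1, PB1), kron_vec(PA2, PB2))
M_AB = kron_mat(MA, IB)

# Acting with M^A (x) I on the joint state ...
lhs = mat_vec(M_AB, P_AB)
# ... agrees with transforming the A factor of each product term.
rhs = mix(kron_vec(mat_vec(MA, PA1), PB1), kron_vec(mat_vec(MA, PA2), PB2))
assert lhs == rhs
```

Exact rational arithmetic is used so that the comparison is an equality rather than a floating-point tolerance check.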



APPENDIX C: GENERIC FEATURES 

This appendix contains proofs of the results of Section IV.

Proof of Theorem 4. Consider a particular type of system in some theory. Suppose that the subspace spanned by allowed states of the system has dimension $d$ and that every mixed state has a unique decomposition into pure states and 0. The only convex set with this property is a simplex with $d + 1$ vertices. One of these vertices is the state 0. It is always possible to find an invertible linear transformation $N$ such that the other vertices are transformed into the vectors $(1, 0, 0, \ldots, 0)$, $(0, 1, 0, \ldots, 0)$, and so on. Recall from the main text that if this transformation acts on the set $S$, then the theory is not changed, since $R \to R.N^{-1}$ and $M \to N.M.N^{-1}$ for measurements and transformations. Hence the system is classical. □
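The converse situation, a non-simplex state space in which a mixed state has more than one decomposition, can be illustrated with a small sketch. It assumes a (2,2) system whose normalized states are all vectors satisfying conditions (D2)-(D4) of Appendix D, so that the pure states are the four deterministic assignments. The names `P1` to `P4` and `mix` are illustrative and not from the paper.

```python
from fractions import Fraction as F

# Pure states of the assumed (2,2) system, ordered as
# (P(a=1|X=1), P(a=2|X=1), P(a=1|X=2), P(a=2|X=2)).
P1 = (1, 0, 1, 0)
P2 = (1, 0, 0, 1)
P3 = (0, 1, 1, 0)
P4 = (0, 1, 0, 1)

def mix(weighted_states):
    """Convex combination of states given as (weight, state) pairs."""
    return tuple(sum(w * s[c] for w, s in weighted_states) for c in range(4))

half = F(1, 2)
mixture_a = mix([(half, P1), (half, P4)])
mixture_b = mix([(half, P2), (half, P3)])

# Two different sets of pure states give the same mixed state,
# so the decomposition is not unique and the state space is not a simplex.
assert mixture_a == mixture_b == (half, half, half, half)
```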

Proof of Theorem 5. Consider a system with a set of allowed states $S$, spanning $V_S$, and let $d$ be the dimension of $V_S$. Choose a set of $d$ distinct pure states $\{P_1, \ldots, P_d\}$ that are linearly independent and collectively span $V_S$. Suppose that a particular transformation is non-disturbing. Its action on each of the $P_i$ is given by $M.P_i = c_i P_i$ with $0 \le c_i \le 1$. If the system is classical, then $S$ is a simplex and the set $\{P_1, \ldots, P_d\}$ must contain all the pure states. Since the $P_i$ are linearly independent, the $c_i$ can be chosen independently without contradiction. For any other type of system, there are at least $d + 1$ pure states. Consider a pure state $Q$ that is not contained in the set $\{P_1, \ldots, P_d\}$. If the transformation is non-disturbing, then $M.Q = eQ$ with $0 \le e \le 1$. Since $\{P_1, \ldots, P_d\}$ is a basis for $V_S$, $Q$ has a unique decomposition of the form $Q = \sum_i d_i P_i$, where at least two of the $d_i$ are non-zero. By linearity $M.Q = \sum_i d_i c_i P_i$, and comparing with $eQ = \sum_i e\, d_i P_i$ gives $c_i = e$ whenever $d_i \neq 0$. In particular, if $d_j$ and $d_k$ are non-zero, then $c_j = c_k = e$. Thus $M$ acts as $e$ times the identity on the subspace of $V_S$ spanned by $P_j$ and $P_k$. By repeating this reasoning for every pure state $Q$, the set $\{P_1, \ldots, P_d\}$ can be divided into subsets such that (i) if $P_j$ and $P_k$ are in the same subset, then $c_j = c_k$ for any non-disturbing transformation, and (ii) if $P_j$ and $P_k$ are in different subsets then there is no pure state $Q$ such that both $d_j$ and $d_k$ are non-zero. Each subset defines a subspace $V_i$ of $V_S$ and the theorem follows. □
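The key step, that a further pure state $Q$ forces the coefficients $c_i$ to agree, can be sketched on a concrete case. The example below is an assumption for illustration only: a (2,2) system with four deterministic pure states, three of which form the basis. The helper names `combine`, `scale_factor` and `image_of_Q` are hypothetical.

```python
from fractions import Fraction as F

# Assumed basis of pure states and a fourth pure state Q, ordered as
# (P(a=1|X=1), P(a=2|X=1), P(a=1|X=2), P(a=2|X=2)).
P1, P2, P3 = (1, 0, 1, 0), (1, 0, 0, 1), (0, 1, 1, 0)
Q = (0, 1, 0, 1)
basis = (P1, P2, P3)
d = (-1, 1, 1)  # expansion coefficients: Q = -P1 + P2 + P3

def combine(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i]."""
    return tuple(sum(c * v[n] for c, v in zip(coeffs, vectors))
                 for n in range(len(vectors[0])))

assert combine(d, basis) == Q

def scale_factor(v, q):
    """Return e if v == e * q for some scalar e, otherwise None."""
    idx = next(n for n, x in enumerate(q) if x != 0)
    e = F(v[idx]) / q[idx]
    return e if all(x == e * y for x, y in zip(v, q)) else None

def image_of_Q(c):
    """M.Q when M.P_i = c_i P_i, computed by linearity."""
    return combine([di * ci for di, ci in zip(d, c)], basis)

# Equal c_i: Q is mapped to a multiple of itself (not disturbed).
equal_case = scale_factor(image_of_Q((F(1, 2), F(1, 2), F(1, 2))), Q)
# Unequal c_i: the image of Q is no longer proportional to Q.
unequal_case = scale_factor(image_of_Q((F(1, 2), F(1, 3), F(1, 2))), Q)

assert equal_case == F(1, 2)
assert unequal_case is None
```

Because all three $d_i$ are non-zero here, any non-disturbing transformation must act as a multiple of the identity on the whole of $V_S$ in this example.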



Proof of Theorem 6. Theorem 6 is proven using Theorem 5. We show that if there is a probabilistic universal cloning procedure, then for any two pure states $P_1$ and $P_2$, there is a non-disturbing transformation $M'$ such that $|M'.P_1| \neq |M'.P_2|$. This in turn implies that the system is classical.

Suppose that there is a standard state $Q$ and a transformation $M$ such that for each pure state $P$, $M(P \otimes Q) = cP \otimes P$. The number $c$ may vary with $P$ but is $> 0$ for all $P$. Consider a procedure in which a system is in the state $P_1$ or $P_2$, an ancilla is added in the standard state $Q$, and the cloning operation $\{M, F\}$ is performed on the joint system. The transformation $M$ corresponds to the success outcome and $F$ to the fail outcome. If $P_1$ and $P_2$ are different states there must be some operation $\{N_1, N_2\}$ such that $|N_1.P_1| \neq |N_1.P_2|$. If cloning succeeded, perform this operation on the ancilla. Output the result and throw away the ancilla.

This entire procedure may be regarded as an operation on the system alone (see the remarks following Assumption 1). It can be written $O' = \{M'_1, M'_2, F'\}$, where $M'_1$ corresponds to successful cloning followed by the $N_1$ outcome, $M'_2$ corresponds to successful cloning followed by the $N_2$ outcome, and $F'$ corresponds to failed cloning. By construction, each of $M'_1$ and $M'_2$ is non-disturbing and $|M'_i.P_1| \neq |M'_i.P_2|$ for at least one of $i = 1, 2$.

Recalling Theorem 5 it follows that $V_S = \bigoplus_i V_i$, with each $V_i$ containing only one pure state, hence the system is classical. □



APPENDIX D: DYNAMICS IN GNST AND GLT 

This appendix contains proofs of Theorems 7, 8 and 9, all of which concern dynamics in GNST or GLT.

Proof of Theorem 7. This theorem concerns transformations of single systems in either GNST or GLT. A transformation of an $(n, k)$ system can be written



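\[
\begin{pmatrix}
P'(a=1 \mid X=1) \\ \vdots \\ P'(a=k \mid X=1) \\ \vdots \\ P'(a=1 \mid X=n) \\ \vdots \\ P'(a=k \mid X=n)
\end{pmatrix}
=
\begin{pmatrix}
M_{11} & \cdots & M_{1n} \\
\vdots & \ddots & \vdots \\
M_{n1} & \cdots & M_{nn}
\end{pmatrix}
\begin{pmatrix}
P(a=1 \mid X=1) \\ \vdots \\ P(a=k \mid X=1) \\ \vdots \\ P(a=1 \mid X=n) \\ \vdots \\ P(a=k \mid X=n)
\end{pmatrix}.
\tag{D1}
\]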



The transformation matrix is $M$, an $nk \times nk$ matrix. If the fiducial measurement $X = 1$ has $k$ outcomes, then the top $k$ rows of this matrix determine the probabilities of outcomes when the $X = 1$ measurement is performed on the transformed state $P'$. Denote the $k \times nk$ submatrix consisting of these rows $M_1$. The next $k$ rows are associated with the fiducial measurement $X = 2$, so denote the corresponding submatrix by $M_2$, and so on. The first $k$ columns of $M_1$ multiply into those components of $P$ that correspond to the fiducial measurement $X = 1$ being performed. Denote the $k \times k$ subsubmatrix consisting of these columns $M_{11}$. Similarly $M_{12}$, and so on. Note that each row in $M$, considered as a vector $R$, must represent a possible yes/no measurement. This is because if the transformation acts on a state $P$, then $R.P$ gives the corresponding entry in the transformed state $P'$, which must be between 0 and 1 for all $P \in S$. Furthermore, when the transformation is normalization-preserving, the rows $R_j$ from a particular $M_i$ satisfy $\sum_j R_j.P = 1$ whenever $P$ is normalized. Hence the rows from a particular $M_i$ correspond to a multiple-outcome measurement. One way of performing this measurement is simply to perform the transformation $M$ first, and then to perform fiducial measurement $X = i$.

There is some redundancy in a measurement vector $R$, and in the matrix $M$. If $R.P = R'.P$ $\forall P \in S$, then $R$ and $R'$ represent the same measurement. In particular, if $R' = R + C$, where $C.P = 0$ $\forall P \in S$, then $R$ and $R'$ represent the same measurement. An example of such a $C$ is

\[
C = (1, \ldots, 1 \mid -1, \ldots, -1 \mid 0, \ldots, 0 \mid \ldots),
\]

where $C.P = 0$ $\forall P \in S$ is ensured by the normalization of $P$. The first step in the proof is to show that any $R$ is equivalent in this sense to an $R'$ with all components $\ge 0$.
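The redundancy can be seen directly in a small case. The sketch below is an illustrative assumption rather than part of the proof: it takes a (2,2) system, a measurement vector `R` with negative entries, and shifts it by a multiple `t` of `C` to obtain an equivalent vector `R_prime` with non-negative entries. The particular `R`, `t` and sample states are chosen only for illustration.

```python
from fractions import Fraction as F

def dot(u, v):
    """Inner product R.P of a measurement vector and a state vector."""
    return sum(x * y for x, y in zip(u, v))

# C = (1, 1 | -1, -1) for a (2,2) system: C.P = 0 on normalized states.
C = [1, 1, -1, -1]

# Hypothetical measurement vector with negative components.
R = [F(1), F(1), F(-1, 2), F(-1, 2)]

# Adding a multiple of C leaves every outcome probability unchanged.
t = F(-1, 2)
R_prime = [r + t * c for r, c in zip(R, C)]

# A few normalized states, ordered as
# (P(a=1|X=1), P(a=2|X=1), P(a=1|X=2), P(a=2|X=2)).
states = [
    [F(1), F(0), F(1, 2), F(1, 2)],
    [F(0), F(1), F(1), F(0)],
    [F(1, 3), F(2, 3), F(1, 4), F(3, 4)],
]

assert all(dot(C, P) == 0 for P in states)
assert all(dot(R, P) == dot(R_prime, P) for P in states)
assert all(x >= 0 for x in R_prime)
```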

For this, consider the set of allowed normalized states. This is precisely the set of vectors satisfying the conditions

\[
\sum_i P(a=i \mid X=j) = \sum_i P(a=i \mid X=k) \quad \forall j,k, \tag{D2}
\]
\[
P(a=i \mid X=j) \ge 0 \quad \forall i,j, \tag{D3}
\]
\[
\sum_i P(a=i \mid X=1) = 1. \tag{D4}
\]

Define $S_+$ as the set of vectors of the form $rP$, with $r \ge 0$ and $P \in S$, and note that in the case of GNST or GLT, $S_+$ is a polyhedral cone [61]. It can also be defined as the set of vectors satisfying conditions (D2) and (D3). The defining inequalities (D3) can each be written in the form $C_i.P \ge 0$, where $C_i$ is a constant vector with a 1 in the $i$th position and 0s elsewhere. The equalities (D2) can each be written as the conjunction of two inequalities: $D_j.P \ge 0$ and $D_j.P \le 0$ for some constant $D_j$. Define $\mathcal{R}_+$ as the set of vectors $R$ such that $R.P \ge 0$ $\forall P \in S_+$. This is the set of unnormalized measurements and is the dual cone to $S_+$. It can be shown that if a polyhedral cone is defined by $\{P : A_i.P \ge 0\ \forall i\}$, then the dual cone is equal to the conic hull of the vectors $A_i$. Thus elements of $\mathcal{R}_+$ can be written

\[
R = \sum_i \lambda_i C_i + \sum_j \mu_j D_j, \tag{D5}
\]

where $\lambda_i \ge 0$ and $\mu_j$ can be positive or negative. Finally, the vectors $D_j$ all satisfy $D_j.P = 0$ $\forall P \in S_+$. Hence any $R$ of this form is equivalent to an $R'$ of the form

\[
R' = \sum_i \lambda_i C_i, \tag{D6}
\]

and without loss of generality, the components of $R$ can be assumed $\ge 0$. This applies both to $R$ considered as a measurement and to $R$ considered as a row of a transformation matrix $M$.

Assume, then, that $M$ is written in a form with all entries $\ge 0$. To conclude the proof, note that $M$ acting on any properly normalized state (satisfying both Eq. (D2) and Eq. (D4)) must result in a state that is also properly normalized. This implies the following. Consider the matrix $M_{ij}$. Denote the sum of the elements in the first column by $S^{ij}_1$, the sum of the elements in the second column by $S^{ij}_2$, and so on. Then $S^{ij}_1 = S^{ij}_2 = \cdots = S^{ij}_k$ and $\sum_j S^{ij}_1 = 1$. Hence the matrix $M_{ij}$ is of the form $\alpha_{ij}$ times a stochastic matrix, with $0 \le \alpha_{ij} \le 1$ and $\sum_j \alpha_{ij} = 1$. One may easily check that any transformation that is equivalent to a procedure of the form of Fig. 5 is represented by a matrix of this form with $\alpha_{ik} = 1$ for some $k$ and $\alpha_{ij} = 0$ for $j \neq k$. Hence we have obtained the general result that any allowed $M$ is a convex combination of transformations of the form of Fig. 5. □
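The block structure derived above can be checked on a small instance. The following sketch assumes a (2,2) system and builds a matrix whose blocks are $\alpha_{ij}$ times column-stochastic matrices with $\sum_j \alpha_{ij} = 1$. The particular weights `alpha`, the stochastic blocks and the helper names `assemble`, `apply` and `is_normalized` are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction as F

n, k = 2, 2  # two fiducial measurements, two outcomes each

def assemble(alpha, S):
    """Build the nk x nk matrix with blocks M_ij = alpha[i][j] * S[i][j]."""
    M = [[F(0)] * (n * k) for _ in range(n * k)]
    for i in range(n):
        for j in range(n):
            for r in range(k):
                for c in range(k):
                    M[i * k + r][j * k + c] = alpha[i][j] * S[i][j][r][c]
    return M

def apply(M, P):
    """Matrix-vector product M.P."""
    return [sum(m * p for m, p in zip(row, P)) for row in M]

def is_normalized(P):
    """Conditions (D2)-(D4): non-negative entries, each block sums to 1."""
    return all(x >= 0 for x in P) and all(
        sum(P[j * k:(j + 1) * k]) == 1 for j in range(n))

# Column-stochastic k x k blocks (each column sums to 1).
ident = [[1, 0], [0, 1]]
flip = [[0, 1], [1, 0]]
noisy = [[F(1, 2), F(1, 3)], [F(1, 2), F(2, 3)]]

S = [[ident, flip], [noisy, ident]]
alpha = [[F(3, 4), F(1, 4)],   # each row of alpha sums to 1
         [F(0), F(1)]]

M = assemble(alpha, S)

states = [
    [F(1), F(0), F(1, 2), F(1, 2)],
    [F(0), F(1), F(1), F(0)],
    [F(1, 3), F(2, 3), F(1, 4), F(3, 4)],
]

# Every normalized input state is mapped to a normalized output state.
assert all(is_normalized(apply(M, P)) for P in states)
```

Setting one entry in each row of `alpha` to 1 and the rest to 0 recovers the extremal case described in the text, where each output measurement is read off from a single input measurement.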

Proof of Theorem 8. Let an $m$-outcome measurement on an $(n, k)$ system have outcomes corresponding to $R_1, \ldots, R_m$, and construct the $m \times nk$ matrix

\[
N = \begin{pmatrix} R_1 \\ \vdots \\ R_m \end{pmatrix}.
\]

Denote the submatrix consisting of the first $k$ columns of $N$ by $N_1$, that consisting of the next $k$ columns by $N_2$, and so on. The same arguments as in the proof of Theorem 7 can be used to establish that $N$ can be chosen such that all entries are $\ge 0$. Then use the fact that $\sum_i R_i.P = 1$ for normalized $P$, and arguments similar to those in the proof of Theorem 7, to establish that $N_i = \alpha_i S_i$ for $0 \le \alpha_i \le 1$, $\sum_i \alpha_i = 1$, and $S_i$ stochastic. The theorem follows. □

Proof of Theorem 9. Begin as before by showing that without loss of generality, the matrix $M$ can be taken to have all entries $\ge 0$. This part of the proof is identical, except that to conditions (D2), (D3) and (D4), one should add the no-signalling constraints

\[
\sum_j P(a=i, b=j \mid X=k, Y=1) = \sum_j P(a=i, b=j \mid X=k, Y=2) \quad \forall i,k, \tag{D7}
\]
\[
\sum_i P(a=i, b=j \mid X=1, Y=l) = \sum_i P(a=i, b=j \mid X=2, Y=l) \quad \forall j,l. \tag{D8}
\]

Like the conditions (D2), these constraints can be written as the conjunction $D_j.P \ge 0$ and $D_j.P \le 0$, and $R$ can be written in the form of Eq. (D5), hence in the form of Eq. (D6). Now impose that $P' = M.P$ is normalized for any allowed normalized $P$, that is any $P$ that satisfies conditions (D2), (D3), (D4), (D7), and (D8). Proving that any such $M$ represents a convex combination of transformations of the form of Fig. 7 (or the reversed form with respect to the two subsystems) is a tedious brute force exercise that is omitted. As with Theorem 8, the proof of Theorem 10 is a straightforward variation. □
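The no-signalling constraints (D7) and (D8) are easy to test for a given table of joint probabilities. As a minimal sketch under stated assumptions, the code below checks them for Popescu-Rohrlich correlations, written with outcomes and measurement choices labelled 1 and 2, and for a deliberately signalling box in which Alice's outcome copies Bob's measurement choice. The function names and the labelling convention are illustrative assumptions.

```python
from fractions import Fraction as F
from itertools import product

VALS = (1, 2)  # labels for outcomes a, b and for measurement choices X, Y

def pr_box(a, b, x, y):
    """PR correlations: outcomes differ exactly when X = Y = 2, each case with probability 1/2."""
    return F(1, 2) if ((a - 1) ^ (b - 1)) == (x - 1) * (y - 1) else F(0)

def signalling_box(a, b, x, y):
    """Hypothetical signalling box: Alice's outcome equals Bob's choice Y."""
    return F(1) if (a == y and b == 1) else F(0)

def is_nonsignalling(P):
    """Check constraints (D7) and (D8) for a joint distribution P(a, b, x, y)."""
    d7 = all(
        sum(P(i, j, k, 1) for j in VALS) == sum(P(i, j, k, 2) for j in VALS)
        for i, k in product(VALS, VALS))
    d8 = all(
        sum(P(i, j, 1, l) for i in VALS) == sum(P(i, j, 2, l) for i in VALS)
        for j, l in product(VALS, VALS))
    return d7 and d8

assert is_nonsignalling(pr_box)
assert not is_nonsignalling(signalling_box)
```

For the PR correlations every marginal equals 1/2 regardless of the other party's measurement choice, which is why both constraints hold.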



